2 Planning and Validating Your Cloud Environment

In preparation for Oracle Communications ASAP cloud native deployment, you must set up and validate prerequisite software. This chapter provides information about planning, setting up, and validating the environment for ASAP cloud native deployment.

If you are already familiar with traditional ASAP, for important information on the differences introduced by ASAP cloud native, see "Differences Between ASAP Cloud Native and ASAP Traditional Deployments".

Required Components for ASAP and Order Balancer Cloud Native

To run, manage, and monitor the ASAP and Order Balancer cloud native deployment, the following components and capabilities are required. These must be configured in the cloud environment:

  • Kubernetes Cluster
  • Container Image Management
  • Helm
  • Load Balancer
  • Domain Name System (DNS)
  • Persistent Volumes
  • Secrets Management
  • Kubernetes Monitoring Toolchain
  • Application Logs and Metrics Toolchain

For details about the required versions of these components, see ASAP Compatibility Matrix.

Planning Your Cloud Native Environment

This section provides information about planning and setting up an ASAP cloud native environment. As part of preparing your environment for ASAP cloud native, you choose, install, and set up various components and services in ways that are best suited for your cloud native environment. The following sections provide information about each of those required components and services, the available options that you can choose from, and the way you must set them up for your ASAP cloud native environment.

Setting Up Your Kubernetes Cluster

For ASAP cloud native, Kubernetes worker nodes must be capable of running Linux 8.x pods with software compiled for Intel 64-bit cores. A reliable cluster must have multiple worker nodes spread over separate physical infrastructure, and a highly reliable cluster must also have multiple master nodes spread over separate physical infrastructure.

The following diagram illustrates the Kubernetes cluster and the components that it interacts with.

ASAP cloud native requires:

  • Kubernetes
    To check the version, run the following command:
    kubectl version
  • Flannel
    To check the version, run the following commands on the master node running the kube-flannel pod:
    podman images | grep flannel
    kubectl get pods --all-namespaces | grep flannel
  • Podman
    To check the version, run the following command:
    podman version

Typically, Kubernetes nodes are not used directly to run or monitor Kubernetes workloads. You must reserve worker node resources for the execution of the Kubernetes workload. However, multiple users (manual and automated) of the cluster require a point from which to access the cluster and operate on it. This can be achieved by using kubectl commands (either directly on the command line, in shell scripts, or through Helm) or Kubernetes APIs. For this purpose, set aside a separate host or set of hosts. Operational and administrative access to the Kubernetes cluster can be restricted to these hosts, and specific users can be given named accounts on these hosts to reduce cluster exposure and promote traceability of actions.
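For example, from one of these designated hosts, a quick check that cluster access is configured correctly might look like the following (it assumes only a working kubeconfig for the named user):

kubectl cluster-info               # confirms reachability of the control plane
kubectl get nodes -o wide          # lists worker nodes with OS image, kernel, and internal IPs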

In addition, you need the appropriate tools to connect to your overall environment, including the Kubernetes cluster. For instance, for a Container Engine for Kubernetes (OKE) cluster, you must install and configure the Oracle Cloud Infrastructure Command Line Interface.

Additional integrations may need to include appropriate NFS mounts for home directories, security lists, firewall configuration for access to the overall environment, and so on.

Kubernetes worker nodes should be configured with the recommended operating system kernel parameters listed in "Configuring a UNIX ASAP Group and User" in ASAP Installation Guide. Use the documented values as the minimum values for each parameter. Ensure that the Linux kernel parameter configuration is persistent so that it survives a reboot.
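The following is a minimal sketch of persisting kernel parameters on a worker node. The parameter names and values shown are placeholders only; substitute the documented minimum values from ASAP Installation Guide:

# Example only: replace the parameters and values with the documented minimums.
sudo tee /etc/sysctl.d/97-asap.conf <<EOF
fs.file-max = 6815744
net.core.rmem_max = 4194304
EOF
sudo sysctl --system    # reloads all sysctl configuration files, including the new one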

The ASAP cloud native instance, for which specification files are provided with the toolkit, requires up to 16 GB of RAM and 2 CPUs of Kubernetes worker node capacity. For more details about hardware sizing, see "ASAP Server Hardware Requirements" in ASAP Installation Guide. A small additional increment is needed for Traefik; refer to that project's documentation for details.

Synchronizing Time Across Servers

It is important that you synchronize the date and time across all machines that are involved in testing, including client test drivers and Kubernetes worker nodes. Oracle recommends that you do this using Network Time Protocol (NTP), rather than manual synchronization, and strongly recommends it for Production environments. Synchronization is important in inter-component communications and in capturing accurate run-time statistics.
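As a quick check that synchronization is in effect on a given host (assuming a systemd-based Linux with chrony as the NTP client, which is typical for Oracle Linux):

timedatectl                 # "System clock synchronized: yes" indicates NTP is in effect
chronyc tracking            # shows the reference NTP source and the current clock offset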

About Container Image Management

An ASAP cloud native deployment generates a container image for ASAP. Additionally, a container image is downloaded for Traefik (depending on your choice of ingress controller).

Oracle highly recommends that you create a private container repository and ensure that all nodes have access to that repository. The image is saved in this repository, from which all nodes can then pull it. This may require networking changes (such as routes and proxies) and authentication for logging in to the repository. Oracle recommends that you choose a repository that provides centralized storage and management for the container image.

Failing to ensure that all nodes have access to a centralized repository means that the image has to be synced to the hosts manually or through custom mechanisms (for example, using scripts), which is error-prone as worker nodes are commissioned, decommissioned, or rebooted. When an image is not available on a particular worker node, the pods using that image are either not scheduled to that node, which wastes resources, or fail on that node. If image names and tags are kept constant (such as myapp:latest), a pod may pick up a stale pre-existing image of the same name and tag, leading to unexpected and hard-to-debug behavior.
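A sketch of the typical flow for publishing an image to such a private repository using Podman; the registry hostname, repository path, image name, and tag are placeholders:

podman login registry.example.com
podman tag asap-cn-image:7.x registry.example.com/asap/asap-cn-image:7.x
podman push registry.example.com/asap/asap-cn-image:7.x
# Worker nodes then pull the image by its fully qualified name:
# registry.example.com/asap/asap-cn-image:7.x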

Installing Helm

ASAP cloud native requires Helm, which delivers reliability, productivity, consistency, and ease of use.

In an ASAP cloud native environment, using Helm enables you to achieve the following:

  • You can apply custom domain configuration by using a single and consistent mechanism, which leads to an increase in productivity. You no longer need to apply configuration changes through multiple interfaces such as WebLogic Console, WLST, and WebLogic Server MBeans.
  • Changing the ASAP domain configuration in the traditional installations is a manual and multi-step process that may lead to errors. This can be eliminated with Helm because of the following features:
    • Helm Lint allows pre-validation of syntax issues before changes are applied
    • Multiple changes can be pushed to the running instance with a single upgrade command
    • Configuration changes may map to updates across multiple Kubernetes resources (such as domain resources, config maps, and so on). With Helm, you merely update the Helm release, and it is Helm's responsibility to determine which Kubernetes resources are affected.
  • Including configuration in Helm charts allows the content to be managed as code, through source control, which is a fundamental principle of modern DevOps practices.

To co-exist with older Helm versions in production environments, ASAP requires Helm 3.15.3 or later saved as helm in PATH.

The following text shows sample commands for installing and validating Helm:

$ cd some-tmp-dir
$ wget https://get.helm.sh/helm-v3.15.3-linux-amd64.tar.gz
$ tar -zxvf helm-v3.15.3-linux-amd64.tar.gz
 
# Find the helm binary in the unpacked directory and move it to its desired destination. This requires root privileges.
$ sudo mv linux-amd64/helm /usr/local/bin/helm
 
# Optional: If access to the deprecated Helm repository "stable" is required, uncomment and run
# helm repo add stable https://charts.helm.sh/stable
 
# verify Helm version
$ helm version
version.BuildInfo{Version:"v3.15.3", GitCommit:"c4e74854886b2efe3321e185578e6db9be0a6e29", GitTreeState:"clean", GoVersion:"go1.14.11"}

Helm leverages kubeconfig for users running the helm command to access the Kubernetes cluster. By default, this is $HOME/.kube/config. Helm inherits the permissions set up for this access into the cluster. You must ensure that if RBAC is configured, then sufficient cluster permissions are granted to users running Helm.
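For example, with RBAC enabled, checks like the following confirm that the user running Helm has the key permissions in the target namespace (the namespace name is a placeholder):

kubectl auth can-i create deployments --namespace asap
kubectl auth can-i create configmaps --namespace asap
kubectl auth can-i create secrets --namespace asap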

About Load Balancing and Ingress Controller

The ASAP cloud native instance runs in Kubernetes. To access application endpoints, you must enable HTTP/S connectivity to the cluster through an appropriate mechanism. This mechanism must be able to route traffic to the ASAP cloud native instance in the Kubernetes cluster.

For ASAP cloud native, an ingress controller is required to expose appropriate services from the ASAP cluster and direct traffic appropriately to the cluster members. An external load balancer is an optional add-on.

Note:

ASAP does not support multiple replicas. However, if you do not want to expose Kubernetes node IP addresses to users, use a load balancer.

The ingress controller monitors the ingress objects created by the ASAP cloud native deployment, and acts on the configuration embedded in these objects to expose ASAP HTTP and HTTPS services to the external network. This is achieved using NodePort services exposed by the ingress controller.

The ingress controller must support:
  • Sticky routing (based on standard session cookie)
  • SSL termination and injecting headers into incoming traffic
Examples of such ingress controllers include Traefik, Voyager, and Nginx. The ASAP cloud native toolkit provides samples and documentation that use Traefik as the ingress controller.

An external load balancer serves to provide a highly reliable single-point access into the services exposed by the Kubernetes cluster. In this case, this would be the NodePort services exposed by the ingress controller on behalf of the ASAP cloud native instance. Using a load balancer removes the need to expose Kubernetes node IPs to the larger user base, and insulates the users from changes (in terms of nodes appearing or being decommissioned) to the Kubernetes cluster. It also serves to enforce access policies. The ASAP cloud native toolkit includes samples and documentation that show integration with Oracle Cloud Infrastructure LBaaS when Oracle OKE is used as the Kubernetes environment.
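To locate the NodePort services through which the ingress controller is reachable (the namespace and service names depend on how Traefik was installed):

kubectl get services --all-namespaces | grep -i traefik
# In the PORT(S) column, the number after the ":" in each port pair is the node port
# that external clients (or the external load balancer) connect to.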

Using Traefik as the Ingress Controller

If you choose to use Traefik as the ingress controller, the Kubernetes environment must have the Traefik ingress controller installed and configured.

For more information about installing and configuring Traefik ingress controller, see "Installing the Traefik Container Image".

For details about the required version of Traefik, see ASAP Compatibility Matrix.

Using Domain Name System (DNS)

A Kubernetes cluster can have many routable entry points. Common choices are:

  • External load balancer (IP and port)
  • Ingress controller service (master node IPs and ingress port)
  • Ingress controller service (worker node IPs and ingress port)

You must identify the proper entry point for your Kubernetes cluster.

ASAP cloud native requires hostnames to be mapped to routable entry points into the Kubernetes cluster. Regardless of the actual entry points (external load balancer, Kubernetes master node, or worker nodes), users who need to communicate with the ASAP cloud native instances require name resolution.

The access hostnames take the prefix.domain form. prefix and domain are determined by the specifications of the ASAP cloud native configuration for a given deployment. prefix is unique to the deployment, while domain is common for multiple deployments.

The default domain in ASAP cloud native toolkit is asap.org.

For a particular deployment, as an example, this results in the following addresses:

  • dev1.wireless.asap.org (for HTTP access)
  • admin.dev1.wireless.asap.org (for WebLogic Console access)

These "hostnames" must be routable to the entry point of your Ingress Controller or Load Balancer. For a basic validation, on the systems that access the deployment, edit the local hosts file to add the following entry:

Note:

The hosts file is located in /etc/hosts on Linux and MacOS machines and in C:\Windows\System32\drivers\etc\hosts on Windows machines.
ip_address  dev1.wireless.asap.org   admin.dev1.wireless.asap.org  t3.dev1.wireless.asap.org

However, the solution of editing the hosts file is not easy to scale and coordinate across multiple users and multiple access environments. A better solution is to leverage DNS services at the enterprise level.

With DNS servers, a more efficient mechanism can be adopted: the creation of a domain-level wildcard A-record:
A-Record: *.asap.org IP_address

If the target is not a load balancer, but the Kubernetes cluster nodes themselves, a DNS service can also insulate the user from relying on any single node IP. The DNS entry can be configured to map *.asap.org to all the current Kubernetes cluster node IP addresses. You must update this mapping as the Kubernetes cluster changes with adding a new node, removing an old node, reassigning the IP address of a node, and so on.

With either of these approaches, you can set up the enterprise DNS once and modify it only infrequently.
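Once the DNS entries are in place, a quick resolution check from a client host might look like this (the hostnames follow the default asap.org domain used in the example above):

dig +short dev1.wireless.asap.org
dig +short admin.dev1.wireless.asap.org
# Both should return the load balancer IP address or the current worker node IP addresses.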

Configuring Kubernetes Persistent Volumes

Typically, runtime artifacts in ASAP cloud native are created within the respective pod filesystems. As a result, they are lost when the pod is deleted. These artifacts include application logs and WebLogic Server logs.

While this impermanence may be acceptable for highly transient environments, it is typically desirable to have access to these artifacts outside of the lifecycle of the ASAP cloud native instance. It is also highly recommended to deploy a toolchain for logs to provide a centralized view with a dashboard. To allow artifacts to survive independent of the pod, ASAP cloud native allows them to be maintained on Kubernetes Persistent Volumes.

ASAP cloud native does not dictate the technology that supports Persistent Volumes but provides samples for NFS-based persistence. Additionally, for ASAP cloud native on an Oracle OKE cloud, you can use persistence based on File Storage Service (FSS).

Regardless of the persistence provider chosen, persistent volumes for ASAP cloud native use must be configured:
  • With accessMode ReadWriteMany
  • With a capacity to support the intended workload

Log size and retention policies can be configured as part of the shape specification.
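The following is a minimal sketch of an NFS-backed PV and a matching PVC that satisfy these requirements. The server name, export path, namespace, and capacity are placeholders to adapt to your environment:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: asap-logs-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs-server.example.com
    path: /export/asap-logs
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: asap-logs-pvc
  namespace: asap
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: ""
  volumeName: asap-logs-pv
  resources:
    requests:
      storage: 50Gi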

About NFS-based Persistence

For use with ASAP cloud native, one or more NFS (Network File System) servers must be designated.

It is highly recommended to split the servers as follows:

  • At least one for the development instance and the non-sensitive test instance (for example, for Integration testing)
  • At least one for the sensitive test instance (for example, for Performance testing, Stress testing, and production staging)
  • One for the production instance

In general, ensure that the sensitive instances have dedicated NFS support, so that they do not compete for disk space or network IOPS with others.

The exported filesystems must have enough capacity to support the intended workload. Given the dynamic nature of the ASAP cloud native instances, and the fact that the ASAP logging volume is highly dependent on cartridges and on the order volume, it is prudent to put in place a set of operational mechanisms to:

  • Monitor disk usage and warn when the usage crosses a threshold
  • Clean out the artifacts that are no longer needed

If a toolchain such as Elastic Stack picks up this data, then the cleanup task can be built into that process itself. As artifacts are successfully populated into the toolchain, they can be deleted from the filesystem. You must take care to delete only log files that have rolled over.
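An illustrative cleanup along these lines removes rolled-over log files older than seven days; the mount path, file name pattern, and retention period are placeholders and must be adjusted so that active log files are never matched:

find /nfs/asap-logs -type f -name "*.log.[0-9]*" -mtime +7 -print -delete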

Using Kubernetes Monitoring Toolchain

A multi-node Kubernetes cluster with multiple users and an ever-changing workload requires a capable set of tools to monitor and manage the cluster. There are tools that provide data, rich visualizations, and other capabilities such as alerts. ASAP cloud native does not require any particular system to be used but recommends using such a monitoring, visualization, and alerting capability.

For ASAP cloud native, the key aspects of monitoring are:

  • Worker capacity in CPU and memory. The pods take up a non-trivial amount of worker resources. For example, pods configured for production performance use 32 GB of memory.
  • Worker node disk pressure
  • Worker node network pressure
  • The health of the core Kubernetes services
  • The health of Traefik (or other load balancers in the cluster)
The namespaces and pods that ASAP cloud native uses provide a cross-instance view of ASAP cloud native.
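If the Kubernetes Metrics Server is installed in the cluster, quick spot checks of worker and pod resource consumption are possible from the command line (the namespace name is a placeholder):

kubectl top nodes
kubectl top pods -n asap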

About Application Logs and Metrics Toolchain

ASAP cloud native generates all logs that traditional ASAP and WebLogic Server typically generate. The logs can be sent to a shared filesystem for retention and for retrieval by a toolchain such as Elastic Stack.

In addition, ASAP cloud native generates metrics, which it exposes for scraping by Prometheus. These can then be processed by a metrics toolchain, with visualizations such as Grafana dashboards. Dashboards and alerts can be configured to enable sustainable monitoring of multiple ASAP cloud native instances throughout their lifecycles. Performance metrics include heap utilization, stuck threads, garbage collection, and so on.
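As an illustration, a Prometheus scrape configuration based on the common pod-annotation convention is sketched below. Whether your ASAP cloud native pods carry the prometheus.io/scrape annotations depends on your instance configuration, so treat this as a starting point rather than a prescribed integration:

scrape_configs:
- job_name: 'asap-cloud-native'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Keep only pods that are explicitly annotated for scraping.
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"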

Oracle highly recommends using a toolchain to effectively monitor ASAP cloud native instances. The dynamic lifecycle in ASAP cloud native, in terms of deploying, scaling, and updating an instance, requires proper monitoring and management of the database resources as well. For non-sensitive environments such as development instances and some test instances, this largely implies monitoring the tablespace usage and the disk usage, and adding disk space as needed.

Setting Up Persistent Storage

ASAP and Order Balancer cloud native can be configured to use a Kubernetes Persistent Volume to store data that needs to be retained even after a pod is terminated. This data includes application logs and WebLogic Server logs. When an instance is re-created, the same persistent volume need not be available. When persistent storage is configured in the ASAP image, these data files, which are written inside a pod, are redirected to the persistent volume.

Data from all ASAP and Order Balancer instances may be persisted, but each instance does not need a unique location for logging. Data is written to an asap-instance folder or ob-instance folder, so multiple instances can share the same end location without destroying data from other instances.

The final location for this data should be one that is directly visible to the users of ASAP and Order Balancer cloud native. Development instances may direct data to a shared file system for analysis and debugging by cartridge developers, whereas formal test and production instances may need the data to be scraped by a logging toolchain such as EFK, which can then process the data and make it available in various forms. The recommendation, therefore, is to create a PV-PVC pair for each class of destination within a project: in this example, one for developers to access and one that feeds into a toolchain.

A PV-PVC pair is created for each of these "destinations", which multiple instances can then share. A single PVC can be used by multiple ASAP and Order Balancer instances. The management of the PV (Persistent Volume) and PVC (Persistent Volume Claim) lifecycles is beyond the scope of ASAP and Order Balancer cloud native.

The ASAP and Order Balancer cloud native infrastructure administrator is responsible for creating and deleting PVs or for setting up dynamic volume provisioning.

The ASAP and Order Balancer cloud native project administrator is responsible for creating and deleting PVCs as per the standard documentation in a manner such that they consume the pre-created PVs or trigger the dynamic volume provisioning. The specific technology supporting the PV is also beyond the scope of ASAP and Order Balancer cloud native. However, samples for PV supported by NFS are provided.

Planning Your Container Engine for Kubernetes (OKE) Cloud Environment

This section provides information about planning your cloud environment if you want to use Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE) for ASAP cloud native. Some of the components, services, and capabilities that are required and recommended for a cloud native environment are applicable to the Oracle OKE cloud environment as well.

  • Kubernetes and Container Images: You can choose from the version options available in OKE as long as the selected version conforms to the range described in the section about planning cloud native environment.
  • Container Image Management: ASAP cloud native recommends using Oracle Cloud Infrastructure Registry with OKE. Any other repository that you use must be able to serve images to the OKE environment in a quick and reliable manner. The ASAP cloud native image is on the order of 12 GB.
  • Oracle Multitenant Database: It is strongly recommended to run Oracle DB outside of OKE, but within the same Oracle Cloud Infrastructure tenancy and the region as an Oracle DB service (BareMetal, VM, or ExaData). The database version should be 19c. You can choose between a standalone DB or a multi-node RAC.
  • Helm: Install Helm as described for the cloud native environment into the OKE cluster.
  • Persistent Volumes: Use NFS-based persistence. ASAP cloud native recommends the use of Oracle Cloud Infrastructure File Storage service in the OKE context.
  • Monitoring Toolchains: The Oracle Cloud Infrastructure Console provides a view of the resources in the OKE cluster and also enables you to use the Kubernetes Dashboard. Any additional monitoring capability must be built up.

Compute Disk Space Requirements

Given the size of the ASAP cloud native container image (approximately 12 GB), the size of the ASAP cloud native containers, and the volume of the ASAP logs generated, it is recommended that the OKE worker nodes have at least 60 GB of free space that the /var/lib filesystem can use. Add disk space if the worker nodes do not have the recommended free space in the /var/lib filesystem.

Work with your Oracle Cloud Infrastructure OKE administrator to ensure worker nodes have enough disk space. Common options are to use Compute shapes with larger boot volumes or to mount an Oracle Cloud Infrastructure Block Volume to /var/lib/podman.

Note:

The reference to logs in this section applies to the container logs and other infrastructure logs. The space considerations still apply even if the ASAP cloud native logs are being sent to an NFS Persistent Volume.
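To confirm the available space on a worker node, a simple check such as the following can be used (run on each node):

df -h /var/lib        # reports the size and free space of the filesystem backing /var/lib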

Connectivity Requirements

ASAP cloud native assumes the connectivity between the OKE cluster and the Oracle CDBs is LAN-equivalent in reliability, performance, and throughput. This can be achieved by creating the Oracle CDBs within the same tenancy as the OKE cluster and in the same Oracle Cloud Infrastructure region.

ASAP cloud native allows for the full range of Oracle Cloud Infrastructure "cloud-to-ground" connectivity options for integrating the OKE cluster with on-premise applications and users. Selecting, provisioning, and testing such connectivity is a critical part of adopting Oracle Cloud Infrastructure OKE.

Using Load Balancer as a Service (LBaaS)

For load balancing, you have the option of using the services available in OKE. The infrastructure for OKE is provided by Oracle's IaaS offering, Oracle Cloud Infrastructure. In OKE, the master node IP address is not exposed to the tenants, and the IP addresses of the worker nodes are not guaranteed to be static. This makes DNS mapping difficult to achieve. Additionally, the load must be balanced across the worker nodes. To fulfill these requirements, you can use Load Balancer as a Service (LBaaS) of Oracle Cloud Infrastructure.

You must create a Kubernetes service, as described in the OCI LBaaS documentation, to expose your ingress controller through a load balancer. Once this is done, you can describe the resulting service and note down the EXTERNAL-IP and PORT(S) values. The EXTERNAL-IP must be used for DNS mapping and wherever an access hostname or IP address is required. The PORT(S) values provide the access ports: for each port pair, the number before the colon (":") is the externally accessible port.
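A minimal sketch of such a service is shown below. It assumes that Traefik was installed by its Helm chart into a traefik namespace with the standard app.kubernetes.io/name: traefik labels and named ports (web, websecure); verify the labels and port names in your own installation before applying:

apiVersion: v1
kind: Service
metadata:
  name: traefik-lbaas
  namespace: traefik
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: traefik
  ports:
  - name: web
    port: 80
    targetPort: web
  - name: websecure
    port: 443
    targetPort: websecure

After applying this definition, kubectl -n traefik get service traefik-lbaas shows the EXTERNAL-IP and PORT(S) values described above.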

For additional details, see the topics that follow.

About Using Oracle Cloud Infrastructure Domain Name System (DNS) Zones

While a custom DNS service can provide the addressing needs of ASAP cloud native even when ASAP is running in OKE, you can also evaluate the Oracle Cloud Infrastructure Domain Name System (DNS) zones capability. Configuration of DNS zones (and integration with on-premise DNS systems) is not within the scope of ASAP cloud native.

Using Persistent Volumes and File Storage Service (FSS)

In the OKE cluster, ASAP cloud native can leverage the high performance, high capacity, high reliability File Storage Service (FSS) as the backing for the persistent volumes of ASAP cloud native. There are two flavors of FSS usage in this context:
  • Allocating FSS by setting up NFS mount target
  • Native FSS

To use FSS through an NFS mount target, see instructions for allocating FSS and setting up a Mount Target in "Creating File Systems" in the Oracle Cloud Infrastructure documentation. Note down the Mount Target IP address and the storage path and use these in the ASAP cloud native instance specification as the NFS host and path. This approach is simple to set up and leverages the NFS storage provisioner that is typically available in all Kubernetes installations. However, the data flows through the mount target, which models an NFS server.

FSS can also be used natively, without requiring the NFS protocol. This can be achieved by leveraging the FSS storage provisioner supplied by OKE. The broad outline of how to do this is available in the blog post "Using File Storage Service with Container Engine for Kubernetes" on the Oracle Cloud Infrastructure blog.

Leveraging Oracle Cloud Infrastructure Services

For your OKE environment, you can leverage existing services and capabilities that are available with Oracle Cloud Infrastructure. The following table lists the Oracle Cloud Infrastructure services that you can leverage for your OKE cloud environment.

Table 2-1 Oracle Cloud Infrastructure Services for OKE Cloud Environment

Type of Service        Service                       Mandatory/Recommended/Optional
Developer Service      Container Clusters            Mandatory
Developer Service      Registry                      Recommended
Core Infrastructure    Compute Instances             Mandatory
Core Infrastructure    File Storage                  Recommended
Core Infrastructure    Block Volumes                 Optional
Core Infrastructure    Networking                    Mandatory
Core Infrastructure    Load Balancers                Recommended
Core Infrastructure    DNS Zones                     Optional
Database               BareMetal, VM, and ExaData    Recommended

Validating Your Cloud Environment

Before you start using your cloud environment for deploying cloud native instances, you must validate the environment to ensure that it is set up properly and that any prevailing issues are identified and resolved. This section describes the tasks that you should perform to validate your cloud environment.

You can validate your cloud environment by:

  • Performing a smoke test of the Kubernetes cluster
  • Validating the common building blocks in the Kubernetes cluster

Performing a Smoke Test

You can perform a smoke test of your Kubernetes cloud environment by running nginx. This procedure validates basic routing within the Kubernetes cluster and access from outside the environment. It also allows for an initial RBAC examination, as you need to have permissions to perform the smoke test. For the smoke test, you need the nginx 1.27.3 container image.

Note:

The nginx container image version required for the smoke test can change over time. See the content of the deployment.yaml file in step 3 of the following procedure to determine which image is required.

To perform a smoke test:

  1. Download the nginx container image from the container registry of your choice such as Docker Hub.

    For details on managing container images, see "About Container Image Management."

  2. After obtaining the image from the container registry, upload it into your private container repository and ensure that the Kubernetes worker nodes can access the image in the repository.

    Oracle recommends that you download and save the container image to the private repository even if the worker nodes can access the container registry directly. The images in the cloud native toolkit are available only through your private repository.
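    For example, using Podman with the nginx version referenced above (the private registry hostname and repository path are placeholders):

    podman pull docker.io/library/nginx:1.27.3
    podman tag docker.io/library/nginx:1.27.3 registry.example.com/smoke/nginx:1.27.3
    podman push registry.example.com/smoke/nginx:1.27.3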

  3. Run the following commands:

    kubectl apply -f https://k8s.io/examples/application/deployment.yaml # the deployment specifies two replicas
    kubectl get pods     # Must return two pods in the Running state
    kubectl expose deployment nginx-deployment --type=NodePort --name=external-nginx
    kubectl get service external-nginx    # Make a note of the external port for nginx

    These commands must run successfully and return information about the pods and the port for nginx.

  4. Open the following URL in a browser:

    http://master_IP:port/
    where:
    • master_IP is the IP address of the master node of the Kubernetes cluster or the external IP address for which routing has been set up
    • port is the external port for the external-nginx service
  5. To track which pod is responding, on each pod, modify the text message in the webpage served by nginx. In the following example, this is done for the deployment of two pods:
    $ kubectl get pods -o wide | grep nginx
    nginx-deployment-5c689d88bb-g7zvh   1/1     Running   0          1d     10.244.0.149   worker1   <none>
    nginx-deployment-5c689d88bb-r68g4   1/1     Running   0          1d     10.244.0.148   worker2   <none>
    $ cd /tmp
    $ echo "This is pod A - nginx-deployment-5c689d88bb-g7zvh - worker1" > index.html
    $ kubectl cp index.html nginx-deployment-5c689d88bb-g7zvh:/usr/share/nginx/html/index.html
    $ echo "This is pod B - nginx-deployment-5c689d88bb-r68g4 - worker2" > index.html
    $ kubectl cp index.html nginx-deployment-5c689d88bb-r68g4:/usr/share/nginx/html/index.html
    $ rm index.html
  6. Check the index.html webpage to identify which pod is serving the page.

  7. Check if you can reach all the pods by running refresh (Ctrl+R) and hard refresh (Ctrl+Shift+R) on the index.html webpage.

  8. If you see the default nginx page instead of the page with your custom message, it indicates that the pod has restarted. If a pod restarts, the custom message on the page gets deleted.

    Identify the pod that restarted and apply the custom message for that pod.

  9. Increase the pod count by patching the deployment.

    For instance, if you have three worker nodes, run the following command:

    Note:

    Adjust the number as per your cluster. You may find you have to increase the pod count to more than your worker node count until you see at least one pod on each worker node. If this is not observed in your environment even with higher pod counts, consult your Kubernetes administrator. Meanwhile, try to get as much worker node coverage as reasonably possible.
    kubectl patch deployment nginx-deployment -p '{"spec":{"replicas":3}}' --type merge
  10. For each pod that you add, repeat step 5 to step 8.

Having at least one nginx pod in the Running state on every worker node confirms that all worker nodes have access to the container registry or to your private repository.

Validating Common Building Blocks in the Kubernetes Cluster

To approach ASAP cloud native in a sustainable manner, you must individually validate the common building blocks that sit on top of the basic Kubernetes infrastructure. The following sections describe how you can validate these building blocks.

Network File System (NFS)

ASAP cloud native uses Kubernetes Persistent Volumes (PV) and Persistent Volume Claims (PVC) to use a pod-remote destination filesystem for ASAP logs and performance data. By default, these artifacts are stored within a pod in Kubernetes and are not easily available for integration into a toolchain. For these to be available externally, the Kubernetes environment must implement a mechanism for fulfilling PV and PVC. The Network File System (NFS) is a common PV mechanism.

For the Kubernetes environment, identify an NFS server and create or export an NFS filesystem from it.

Ensure that this filesystem:
  • Has enough space for the ASAP logs and performance data

  • Is mountable on all the Kubernetes worker nodes (a quick manual check is shown after this list)
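A quick manual check of the export and its mountability, run on each worker node (the server name and export path are placeholders):

showmount -e nfs-server.example.com                           # lists the exported filesystems
sudo mount -t nfs nfs-server.example.com:/export/asap /mnt
df -h /mnt                                                    # confirms capacity as seen by the node
sudo umount /mnt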

Create an nginx pod that mounts an NFS PV from the identified server. For details, see the documentation about "Kubernetes Persistent Volumes" on the Kubernetes website. This activity verifies the integration of NFS, PV/PVC, and the Kubernetes cluster. To clean up the environment, delete the nginx pod, the PVC, and the PV.
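As a concrete sketch of such a validation pod, assuming a PVC named smoke-nfs-pvc has already been created against the NFS-backed PV (all names are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: smoke-nfs-nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.27.3
    volumeMounts:
    - name: nfs-volume
      mountPath: /usr/share/nginx/html
  volumes:
  - name: nfs-volume
    persistentVolumeClaim:
      claimName: smoke-nfs-pvc

Writing a file into /usr/share/nginx/html from inside the pod and confirming that it appears on the NFS export verifies the end-to-end integration.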

Ideally, data such as logs and JFR data is stored in the PV only until it can be retrieved into a monitoring toolchain such as Elastic Stack. The toolchain must delete the rolled-over log files after processing them. This helps you predict the size of the filesystem. You must also consider factors such as the number of ASAP cloud native instances that will use this space, the size of those instances, the volume of orders they will process, and the volume of logs that your cartridges generate.

Validating the Load Balancer

For a development-grade environment, you can use an in-cluster software load balancer. ASAP cloud native toolkit provides documentation and samples that show you how to use Traefik to perform load balancing activities for your Kubernetes cluster.

It is not necessary to run through "Traefik Quick Start" as part of validating the environment. However, if the ASAP cloud native instances have connectivity issues with HTTP/HTTPS traffic, and the ASAP logs do not show any failures, it might be worthwhile to take a step back and validate Traefik separately using Traefik Quick Start.

A more intensive environment, such as a test, production, pre-production, or performance environment, can additionally require a more robust load balancing service to handle the HTTP/HTTPS traffic. For such environments, Oracle recommends using load balancing hardware that is set up outside the Kubernetes cluster. A few examples of external load balancers are Oracle Cloud Infrastructure LBaaS for OKE, Google's Network LB Service in GKE, and F5's Big-IP for private cloud. The actual selection and configuration of an external load balancer is outside the scope of ASAP cloud native itself but is an important component to sort out in the implementation of ASAP cloud native. For more details on the requirements and options, see "Integrating ASAP".

To validate the ingress controller of your choice, you can use the same nginx deployment used in the smoke test described earlier. This is valid only when run in a Kubernetes cluster where multiple worker nodes are available to take the workload.

To perform a smoke test of your ingress setup:

  1. Run the following commands:
    kubectl apply -f https://k8s.io/examples/application/deployment.yaml
    kubectl get pods -o wide    # two nginx pods in Running state; ensure these are on different worker nodes
    cat > smoke-internal-nginx-svc.yaml <<EOF
    apiVersion: v1
    kind: Service
    metadata:
      name: smoke-internal-nginx
      namespace: default
    spec:
      ports:
      - port: 80
        protocol: TCP
        targetPort: 80
      selector:
        app: nginx
      sessionAffinity: None
      type: ClusterIP
    EOF
    kubectl apply -f ./smoke-internal-nginx-svc.yaml
    kubectl get svc smoke-internal-nginx
  2. Create your ingress targeting the smoke-internal-nginx service. The following text shows a sample ingress annotated to work with the Traefik ingress controller:
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:
        kubernetes.io/ingress.class: traefik
      name: smoke-nginx-ingress
      namespace: default
    spec:
      rules:
      - host: smoke.nginx.asaptest.org
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: smoke-internal-nginx
                port:
                  number: 80

    If the Traefik ingress controller is configured to monitor the default name space, then Traefik creates a reverse proxy and the load balancer for the nginx deployment. For more details, see Traefik documentation.

    If you plan to use other ingress controllers, refer to the documentation about the corresponding controllers for information on creating the appropriate ingress and make it known to the controller. The ingress definition should be largely reusable, with ingress controller vendors describing their own annotations that should be specified, instead of the Traefik annotation used in the example.

  3. Create a local DNS/hosts entry in your client system mapping smoke.nginx.asaptest.org to the IP address of the cluster, which is typically the IP address of the Kubernetes master node, but could be configured differently.

  4. Open the following URL in a browser:

    http://smoke.nginx.asaptest.org:Traefik_Port/

    where Traefik_Port is the external port that Traefik has been configured to expose.

  5. Verify that the web address opens and displays the nginx default page.

Your ingress controller must support session stickiness for ASAP cloud native. To learn how stickiness should be configured, refer to the documentation about the ingress controller you choose. For Traefik, stickiness must be set up at the service level itself. For testing purposes, you can modify the smoke-internal-nginx service to enable stickiness by running the following commands:
kubectl delete ingress smoke-nginx-ingress
vi smoke-internal-nginx-svc.yaml
# Add an annotations section under the metadata section:
#   annotations:
#     traefik.ingress.kubernetes.io/affinity: "true"
kubectl apply -f ./smoke-internal-nginx-svc.yaml
# now apply back the ingress smoke-nginx-ingress using the above yaml definition

Other ingress controllers may have different configuration requirements for session stickiness. Once you have configured your ingress controller, the smoke-nginx-ingress ingress, and the smoke-internal-nginx service as required, repeat the browser-based procedure to verify and confirm that nginx is still reachable. As you refresh (Ctrl+R) the browser, you should see the page being served by one of the pods. Repeatedly refreshing the web page should show the same pod servicing the access request.

To further test session stickiness, you can either do a hard refresh (Ctrl+Shift+R) or restart your browser (you may have to use the browser in Incognito or Private mode), or clear your browser cache for the access hostname for your Kubernetes cluster. You may observe that the same nginx pod or a different pod is servicing the request. Refreshing the page repeatedly should stick with the same pod while hard refreshes should switch to the other pod occasionally. As the deployment has two pods, the chances of a switch with a hard refresh are 50%. You can modify the deployment to increase the number of replica nginx pods (controlled by the replicas parameter under spec) to increase the odds of a switch. For example, with four nginx pods in the deployment, the odds of a switch with hard refresh rise to 75%. Before testing with the new pods, run the commands for identifying the pods to add unique identification to the new pods. See the procedure in "Performing a Smoke Test" for the commands.

To clean up the environment after the test, delete the following services and the deployment:

  • smoke-nginx-ingress
  • smoke-internal-nginx
  • nginx-deployment