Trip to Tobermory - a perfect getaway within driving distance of Toronto

My family's no-labouring plan for this year's Labour Day long weekend included some cool quality time at the warm Sauble Beach, an overnight stay in the traditionally Scottish town of Kincardine, and a trip to the majestic lakeside town of Tobermory. It was a short (two days only) plan for a long weekend. We intentionally kept it to two days so that we could return home on Monday and finish any last-minute preparations for the new school year starting on Tuesday.

    As seen from Ferry Terminal - Lake Huron and Islands nearby
                                         
Originally, our plan was to visit Tobermory for the full duration of the weekend and soak in the beauty of the place. However, after an unsuccessful attempt to book a hotel (Expedia and hotels.com only showed listings that famously said "we are sold out"), we shifted our attention to the next nearby town, Owen Sound. I found myself in a similar situation and had no luck finding a hotel in that city either. At one point I was even thinking about giving up on the long weekend idea, but one of my friends gave me the brilliant idea of booking a hotel in Kincardine, Ontario, and I did. Early Saturday morning, we were to drive about 2 hours and 50 minutes from Markham to Sauble Beach. However, thanks to my daughters' research on Kincardine, we changed the plan at the last minute. They found very good reviews of Station Beach in Kincardine and convinced us to head directly there rather than going to Sauble Beach. The fact that we had been to Sauble Beach before also helped to make that decision.
      On Saturday morning, we left home around 7:50 am. The first few hours of our morning were foggy, but the scene began to look better as the fog slowly lifted. Sometimes even a few hours of driving (it is about a two-and-a-half-hour drive from Toronto to Kincardine) can be boring. However, if you have cheerful kids in the car, it's sure to be a lot of fun. We enjoyed watching the countryside while driving. We even played "I spy with my little eye ..." Even though my kids are teenagers now, they love playing this game on trips, and it keeps us all engaged so we don't miss out on the little things.

Conquering Kincardine

We were at Station Beach by 11:30 am. I immediately fell in love with the beach because of its clean, pristine water and long sandy shoreline. It is conveniently located within walking distance of downtown, and parking is free. As soon as I touched the water, I felt a deep need to submerge myself in it immediately. It was amazing to swim there. The beach wasn't too crowded, which made it the perfect place to relax and take a breather. No matter what your idea of relaxing is, you can achieve it there. You can swim, sleep, read, or maybe play beach volleyball. We had brought food from home, so we enjoyed a little picnic there. I even had a good nap while sunbathing. My kids went for a walk along the boardwalk (a great place to walk, jog and enjoy the view) and took beautiful pictures and videos. The boardwalk became a walk-and-learn experience for them, as it has interpretive signs with information about local marine history and shipwrecks.

Station Beach - even birds are sunbathing here!!!


      Marina at Station beach
               Interpretive sign
Tips:

  • Station beach is located at 151 Station Beach Road, Kincardine ON
  • As per www.canoe.ca, this water is listed as one of the top 9 destinations for surfing.
  • For those with mobility issues, there are 'MOBI-Mats' at Station Beach. The mats stretch right to the water's edge.
  • If you enjoy playing beach volleyball, there is co-ed beach volleyball every Friday at 7 p.m.
  • The park has a bouncy castle for kids to enjoy
  • The lighthouse and museum are just a 2-to-3-minute walk away, and lakeside downtown Kincardine is just nearby.

After spending almost three and a half hours on the beach, we drove to the hotel (Sutton Park Inn). The check-in process was smooth and the hotel was nice. After taking a shower and having a fresh cup of coffee, we went out for a tour of downtown Kincardine. We parked our car on Harbour Street, near the lighthouse, and went for a walk along Queen Street. Kincardine is a small town located on the shores of Lake Huron in Bruce County and has a strong Scottish heritage. I was told that during summer, every Saturday night, people in Kincardine celebrate and take part in Scottish Pipe Band Parades.

Queen Street, downtown Kincardine

If you have a sweet tooth, you should visit the little chocolate shop (Mill Creek Chocolates) located at 813 Queen Street. They offer handmade chocolates in very personalized packaging, finished with colourful wraps and ribbons. You can enjoy the chocolates yourself or bring them as a gift. You'll love them.

Mill Creek Chocolates Shop at Queen street.
Mill Creek Chocolates showroom

We also had a chance to have a friendly conversation with the sales lady. She told us a little history of Kincardine and Mill Creek Chocolates, as well as how she ended up living in Kincardine. Originally from Brampton, Ontario, she once visited Kincardine, liked the town, and has lived there ever since.
     I had read that Station Beach had one of the most beautiful sunsets. Unfortunately, the evening became cloudy around 6:30 PM and it looked like it was going to rain. A little frustrated, we instead went to see the lighthouse, which we found very fitting to the aesthetic of the small town. Around 7:00 pm, as we were about to head back to the hotel, it started raining. It rained all night long, but we were prepared for this. We had brought the board game Monopoly with us. We played until about 11:00 pm and had tons of fun while snacking on the Mill Creek chocolates.
     It's no secret that when I initially booked my hotel in Kincardine, the purpose was to use it just as a transit point. Now, however, I have no regrets whatsoever. We enjoyed it fully.


Trip to Tobermory


The next day (Sunday) we got up on time to have the complimentary breakfast at the hotel, and after packing our bags we started driving to Tobermory. It was around a two-hour drive, and we reached Tobermory at around 11:00 AM. We purchased our tickets for the cruise (Bruce Anchor company - 7468 HWY 6). Be prepared to spend time in the ticket queue: even if you have pre-purchased your ticket online, you still have to stand in line to check in and get a parking permit for your car, although they do have a separate window to serve holders of pre-purchased tickets.
Notes:
  • The Bruce Anchor company offers a few options. See their website for more details.
  • Ticket price also includes the parking fee, and they run a free shuttle from the parking lot to the Ferry Terminal and back.
  • There are other boat and cruise services as well.

We booked the Tobermory Explorer option and enjoyed the view (while staying aboard) of shipwrecks, lighthouses, and several beautiful islands in Fathom Five National Marine Park.

Bruce Anchor boat cruise
Big Tub Lighthouse

It was around 75 minutes of fun-filled cruising. The cruise moves at a slow speed around Little Tub Harbour, where you can see the sunken ships (note: Tobermory is home to over 20 historic shipwrecks) through the glass windows or glass bottom. The speed picks up once the boat is out of Little Tub Harbour, water comes flying through the window, and you'll get a taste of the fresh water of Lake Huron.

Sunken ship as seen from glass window of cruise.
One of the flower pot rock pillars in Flowerpot island

If you choose the drop-off option to Flowerpot Island, you need to follow a few rules in order not to damage the natural environment (see the instructions in the picture below). The name Flowerpot apparently comes from the two rock pillars on its shore, which look exactly like flower pots. The island itself is about two square kilometres in area and is a popular tourist destination. Activities like swimming, hiking and camping are allowed there.

Flowerpot island - visitor information
One of the flower pot rock pillars in Flowerpot island

After returning from the cruise, we went for a walk on a trail in the woods, which eventually led us to the lake. We stayed there for about an hour - swimming and watching the waves of blue water continuously hitting the big rocks on the shore. My kids didn't want to return and asked if we could stay for another day or so. But I had to refuse, as we didn't have a hotel booked for another night.
      So, we started driving back home. We drove a good three hours and made a stop along the way for some food and drinks. It was around 10:30 PM when we arrived home.
     For us, it turned out to be a perfect getaway within driving distance of Toronto. I would definitely recommend a trip to Tobermory to anyone! It is an amazing escape that fits both the budget and the time.

How to Create, Troubleshoot and Use NFS type Persistent Storage Volume in Kubernetes

Whether you need to simply persist data or share data among pods, one of the options is to use Network File System (NFS) type Persistent Volumes (PV).
However, you may encounter multiple issues, and a lot of the time the error message(s) you see in the pod's log are not detailed enough or are even misleading. In this blog post, I'm going to show you the step-by-step process (with a real example) of creating a PV and a Persistent Volume Claim (PVC) and using them in a pod. We'll also discuss the possible issues and how to resolve them.

Prerequisites for this exercise:

  1. Make sure you have a working Kubernetes cluster where you can create resources as needed.
  2. Make sure you have a working Network File System (NFS) server that is accessible from all nodes in the Kubernetes cluster.

Process steps:

1) Allow Kubernetes pod/container to use NFS

1.1) Check if SELinux is enabled on your Kubernetes cluster nodes/hosts (where the Kubernetes pod(s) will be created). If it is enabled, we need to make sure it lets the container/pod access the remote NFS share.

$> sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Max kernel policy version: 28

1.2) If it is enabled, find out the value of 'virt_use_nfs'. You can use either the 'getsebool' or 'semanage' utility, as shown below:

$> getsebool virt_use_nfs
virt_use_nfs --> off
or
$> sudo semanage boolean -l | grep virt_use_nfs
virt_use_nfs (off , off) Allow virt to use nfs

1.3) If the value of 'virt_use_nfs' is 'off', make sure to enable it; otherwise, any attempt by a Kubernetes pod to access the NFS share may be denied and you may get a '403 Forbidden' error from your application. You can use the 'setsebool' tool to set the value to '1' or 'on':

$> sudo setsebool -P virt_use_nfs 1

$> sudo semanage boolean -l | grep virt_use_nfs
virt_use_nfs (on , on) Allow virt to use nfs

Note: the -P option sets the value permanently.

2) Create NFS share on NFS server

2.1) Create a directory on the NFS server. My NFS server's IP is 192.168.56.101. Here I'm creating the directory '/var/rabbitmq' on the NFS server as an NFS share and assigning the ownership to 'osboxes:osboxes'. We'll discuss the ownership of the share and its relationship to the pod/container security context a little later in the post.

# Create the directory to be shared
$> sudo mkdir -p /var/rabbitmq

# Change the ownership
$> sudo chown osboxes:osboxes /var/rabbitmq


Important: The right ownership of the NFS share is crucial.

2.2) Add the NFS share to the /etc/exports file. Below, I'm adding all of my Kubernetes nodes; pods running on 192.168.56.101-103 will be able to access the NFS share. The 'root_squash' option "squashes" the power of the remote root user to the lowest local user, preventing unauthorized alterations.

/var/rabbitmq/ 192.168.56.101(rw,sync,root_squash)
/var/rabbitmq/ 192.168.56.102(rw,sync,root_squash)
/var/rabbitmq/ 192.168.56.103(rw,sync,root_squash)


2.3) Export the NFS share.

$> sudo exportfs -a


3) Provisioning of PV and PVC

Let's create a PersistentVolume (PV) and a PersistentVolumeClaim (PVC) for RabbitMQ.
Note: it's important that the PVC and the pod that uses it be in the same namespace. You can create them all in the default namespace; however, here I'm going to create a dedicated namespace for this purpose.

3.1) Create a new namespace, or use an existing one or the default namespace.
The yaml file below (shared-services-ns.yml) defines a namespace object called 'shared-services':

apiVersion: v1
kind: Namespace
metadata:
   name: shared-services

To create the “shared-services” namespace, run the following command:

# Create a new namespace:
$> kubectl create -f shared-services-ns.yml
namespace "shared-services" created

# Verify namespace is created successfully
$> kubectl get namespaces shared-services
NAME              STATUS    AGE
shared-services   Active    36s

3.2) Create a new service account, or use an existing one or the default:
If a service account is not set in the pod definition, the pod uses the default service account for the namespace. Here we are defining a new service account called 'shared-svc-accnt'. File: svcAccnt.yml

apiVersion: v1
kind: ServiceAccount
metadata:
   name: shared-svc-accnt
   namespace: shared-services

To create a new service account 'shared-svc-accnt', run the following command:

# Create service account
$> kubectl create -f svcAccnt.yml
serviceaccount "shared-svc-accnt" created

# Verify service account
$> kubectl describe serviceaccount shared-svc-accnt -n shared-services
Name:                shared-svc-accnt
Namespace:           shared-services
Labels:              
Annotations:         
Image pull secrets:  
Mountable secrets:   shared-svc-accnt-token-mgk9w
Tokens:              shared-svc-accnt-token-mgk9w
Events:              

3.3) Assign role/permission to service account:
Once, service account is created, make sure to provide necessary access permission to service account in the given namespace. Based on your Kubernetes platform, you may do it differently. Since, my Kubernetes is part of Docker Enterprise Edition (EE), I do it through Docker Universal Control Plane (UCP) as described in https://docs.docker.com/ee/ucp/authorization/grant-permissions/#kubernetes-grants. I'll assign 'restricted control' role to my service account 'shared-svc-accnt' in namespace 'shared-services'. If you are using MiniKube or other platform, you may want to refer to generic Kuberentes documents for RBAC and service account permission. Basically, you need to basically create the cluster role(s) and bind it to the service account. Here are some links to corresponding documentation. See https://v1-7.docs.kubernetes.io/docs/admin/authorization/rbac/#service-account-permissions and https://kubernetes.io/docs/reference/access-authn-authz/rbac/#role-and-clusterrole

3.4) Define PV object in a yaml file (rabbitmq-nfs-pv.yml):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbitmq-nfs-pv
  namespace: shared-services
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteMany
  nfs:
    path: /var/rabbitmq/
    server: 192.168.56.101
  persistentVolumeReclaimPolicy: Retain

Note: currently a PV can have the “Retain”, “Recycle”, or “Delete” reclaim policy. For dynamically provisioned PVs, the default reclaim policy is “Delete”. Kubernetes supports the following access modes:

  • ReadWriteOnce – the volume can be mounted as read-write by a single node
  • ReadOnlyMany – the volume can be mounted read-only by many nodes
  • ReadWriteMany – the volume can be mounted as read-write by many nodes

To create a new PV 'rabbitmq-nfs-pv', run the following command:

# Create PV
$> kubectl create -f rabbitmq-nfs-pv.yml
persistentvolume "rabbitmq-nfs-pv" created

# Verify PV
$> kubectl describe pv rabbitmq-nfs-pv
Name:            rabbitmq-nfs-pv
Labels:          
Annotations:     
Finalizers:      []
StorageClass:
Status:          Available
Claim:
Reclaim Policy:  Retain
Access Modes:    RWX
Capacity:        5Gi
Node Affinity:   
Message:
Source:
    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    Server:    192.168.56.101
    Path:      /var/rabbitmq/
    ReadOnly:  false
Events:        

3.5) Define the PVC object in a yaml file (rabbitmq-nfs-pvc.yml):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rabbitmq-nfs-pvc
  namespace: shared-services
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi

Note: make sure to create PVC in the same namespace as your pod(s) that use it.

To create a new PVC 'rabbitmq-nfs-pvc', run the following command:

# Create PVC
$> kubectl create -f rabbitmq-nfs-pvc.yml
persistentvolumeclaim "rabbitmq-nfs-pvc" created

# Verify PVC
$> kubectl describe pvc rabbitmq-nfs-pvc -n shared-services
Name:          rabbitmq-nfs-pvc
Namespace:     shared-services
StorageClass:
Status:        Bound
Volume:        rabbitmq-nfs-pv
Labels:        
Annotations:   pv.kubernetes.io/bind-completed=yes
               pv.kubernetes.io/bound-by-controller=yes
Finalizers:    []
Capacity:      5Gi
Access Modes:  RWX
Events:        

Important: see the status above. It's "Bound", and it's bound to the volume "rabbitmq-nfs-pv" that we created in the previous step. If your PVC is not able to bind with the PV, then it's a problem, most likely with the PV or PVC definition. Make sure your PV and PVC are of the same storage class (if you are using one; for details refer to https://kubernetes.io/docs/concepts/storage/storage-classes/), and that the PV can fully satisfy the specification defined in the PVC.
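A quick way to check the binding from both sides (object names taken from the examples above) is:

# Both should report STATUS as Bound once binding succeeds
$> kubectl get pv rabbitmq-nfs-pv
$> kubectl get pvc rabbitmq-nfs-pvc -n shared-services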


3.6) Now let's put together a simple yaml file that defines the service and deployment objects for RabbitMQ (rabbitmq-nfs-pv-poc-depl.yml):

apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-nfs-poc-svc
  namespace: shared-services
  labels:
    app: rabbitmq-nfs-poc-svc
spec:
  type: NodePort
  ports:
  - name: http
    port: 15672
    targetPort: 15672
  - name: amqp
    protocol: TCP
    port: 5672
    targetPort: 5672
  selector:
    app: rabbitmq-app
---
apiVersion: apps/v1beta2 # for versions prior to 1.9.0
kind: Deployment
metadata:
  name: rabbitmq-depl
  namespace: shared-services
spec:
  selector:
    matchLabels:
      app: rabbitmq-app
  replicas: 1
  template:
    metadata:
      labels:
        app: rabbitmq-app
    spec:
      serviceAccountName: shared-svc-accnt
      securityContext:
        runAsUser: 1000
        supplementalGroups: [1000,65534]
      containers:
      - name: rabbitmq-cnt
        image: rabbitmq
        imagePullPolicy: IfNotPresent
        #privileged: false
        #securityContext:
          #runAsUser: 1000
        ports:
        - containerPort: 15672
          name: http-port
          protocol: TCP
        - containerPort: 5672
          name: amqp
          protocol: TCP
        volumeMounts:
          # 'name' must match the volume name below.
          - name: rabbitmq-mnt
            # Where to mount the volume.
            mountPath: "/var/lib/rabbitmq/"
      volumes:
      - name: rabbitmq-mnt
        persistentVolumeClaim:
          claimName: rabbitmq-nfs-pvc
 

Note:
As seen in rabbitmq-nfs-pv-poc-depl.yml above, I'm defining the security context at the pod level as:

securityContext:
  runAsUser: 1000
  supplementalGroups: [1000,65534]

Here, runAsUser's value '1000' and supplementalGroups' value '1000' belong to user 'osboxes' and group 'osboxes'. The gid '65534' belongs to group 'nfsnobody'.

$> id osboxes
uid=1000(osboxes) gid=1000(osboxes) groups=1000(osboxes),10(wheel),983(docker)

$> id nfsnobody
uid=65534(nfsnobody) gid=65534(nfsnobody) groups=65534(nfsnobody)

My NFS share '/var/rabbitmq' is owned by 'osboxes:osboxes', so I'm specifying those values that belong to osboxes in the securityContext.

A security context can be defined both at the pod level and at the container level. A security context defined at the pod level applies to all containers in the pod. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ has details about configuring the security context for a pod or container.
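Purely as an illustration, a container-level override might look like the fragment below (the commented-out lines in the deployment above hint at the same idea; 'allowPrivilegeEscalation' is just an extra field shown as an example, not something this demo requires):

      containers:
      - name: rabbitmq-cnt
        image: rabbitmq
        securityContext:
          runAsUser: 1000
          allowPrivilegeEscalation: false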


Following command creates rabbitmq deployment and service:

# Create objects 
$> kubectl create -f rabbitmq-nfs-pv-poc-depl.yml
service "rabbitmq-nfs-poc-svc" created
deployment.apps "rabbitmq-depl" created

# Get pods
$> kubectl get pods -n shared-services
NAME                            READY     STATUS    RESTARTS   AGE
rabbitmq-depl-775496b9b-d85l7   1/1       Running   0          7s


Let's check the rabbitmq processes inside the container and files under '/var/rabbitmq' share on NFS server.

# Check process inside the container
$> kubectl exec -it rabbitmq-depl-775496b9b-d85l7 /bin/bash -n shared-services
$> ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
1000         1     0  0 12:38 ?        00:00:00 /bin/sh /usr/lib/rabbitmq/bin/rabbitmq-server
1000       162     1  0 12:38 ?        00:00:00 /usr/lib/erlang/erts-9.3.3.2/bin/epmd -daemon
1000       321     1  5 12:38 ?        00:00:03 /usr/lib/erlang/erts-9.3.3.2/bin/beam.smp -W w -

# Connect to NFS server 


$> ssh osboxes@192.168.56.101
Last login: Sun Aug 26 14:48:19 2018 from centosddcclnt

# Make sure rabbitmq successfully created the files and review the file ownership
$> cd /var/rabbitmq
$> ls -la
total 28
drwxr-xr-x.  5 osboxes osboxes   4096 Aug 26 13:40 .
drwxr-xr-x. 25 root    root      4096 Aug 26 13:34 ..
-rw-------.  1 osboxes nfsnobody   40 Aug 26 13:40 .bash_history
drwxr-xr-x.  3 osboxes nfsnobody 4096 Aug 26 13:38 config
-r--------.  1 osboxes nfsnobody   20 Aug 26 01:00 .erlang.cookie
drwxr-xr-x.  4 osboxes nfsnobody 4096 Aug 26 13:38 mnesia
drwxr-xr-x.  2 osboxes nfsnobody 4096 Aug 26 13:38 schema



4) Possible issues & troubleshooting

4.1) The pod remains in a pending state and the pod description shows 'mount failed: exit status 32', as shown below:

$> kubectl describe pod rabbitmq-shared-app -n shared-services
Name:         rabbitmq-shared-app
Namespace:    shared-services
Node:         centosddcwrk01/192.168.56.103
Start Time:   Thu, 16 Aug 2018 17:03:19 +0100
Labels:       name=rabbitmq-shared-app
Annotations:  
Status:       Pending
IP:
  ...
  ...
...
Events:
  Type     Reason                 Age   From                   Message
  ----     ------                 ----  ----                   -------
  ...
  Warning  FailedMount            50s   kubelet, centosddcucp  MountVolume.SetUp failed for volume .... : mount failed: exit status 32

If you try to run the mount manually from inside the container, you may see the following:

$> kubectl exec -it rabbitmq-depl-bd9689c8-7md48 /bin/bash -n shared-services
root@rabbitmq-depl-bd9689c8-7md48:/# pwd
/


root@rabbitmq-depl-bd9689c8-7md48:/# mount -t nfs 192.168.56.101:/var/rabbitmq /tmp/test
mount: wrong fs type, bad option, bad superblock on 192.168.56.101:/var/rabbitmq,
       missing codepage or helper program, or other error
       (for several filesystems (e.g. nfs, cifs) you might
       need a /sbin/mount.<type> helper program)

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

In this case, review the '/etc/exports' file on the NFS server. This file controls which file systems are exported to remote hosts and specifies options. If your Kubernetes host/node is not listed in this file with the appropriate option(s), a pod running on that node will not be able to mount the share. Make sure to run 'sudo exportfs -a' once you have updated /etc/exports. You can also try to mount manually from your host (instead of from within the container), as sketched below, in order to test whether that host/node is authorized to mount. Refer to https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/deployment_guide/s1-nfs-server-config-exports for details.
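A minimal sketch of that manual test, run on the Kubernetes node itself (the mount point /mnt/nfstest is just an arbitrary directory assumed for the test):

# On the Kubernetes node (not inside the container)
$> sudo mkdir -p /mnt/nfstest
$> sudo mount -t nfs 192.168.56.101:/var/rabbitmq /mnt/nfstest
$> mount | grep nfstest      # verify the share is mounted
$> sudo umount /mnt/nfstest  # clean up afterwards

If this mount fails with a permission or timeout error, fix the /etc/exports entry (and any firewall rules) before debugging the pod any further.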


4.2) Pod fails to instantiate and you see 'chown: changing ownership of '/var/lib/rabbitmq': Operation not permitted' error in the log as shown below:

$> kubectl create -f rabbitmq-nfs-pv-poc-depl.yml
service "rabbitmq-nfs-poc-svc" created
deployment.apps "rabbitmq-depl" created

$> kubectl get pods -n shared-services
NAME                             READY     STATUS             RESTARTS   AGE
rabbitmq-depl-5fff645d95-429vd   0/1       CrashLoopBackOff   1          14s

$> kubectl logs rabbitmq-depl-5fff645d95-429vd -n shared-services
chown: changing ownership of '/var/lib/rabbitmq': Operation not permitted

This means that the pod is able to mount the share successfully; however, it's not able to change the ownership of the file/directory. The easiest way to resolve this issue is to use the same user as both the owner of the NFS share on the NFS server and the runAsUser of the Kubernetes pod. For example, for this demo, I have used the 'osboxes' user, which owns the NFS share, and also used this user's uid '1000' in the pod-level security context.

$> ls -lZ /var/rabbitmq
drwxr-xr-x. osboxes nfsnobody system_u:object_r:var_t:s0       ...

$> id osboxes
uid=1000(osboxes) gid=1000(osboxes) groups=1000(osboxes),10(wheel),983(docker)

In reality, it may not be that easy. You may not have access to the remote NFS server, or the NFS server's system administrator may not be willing to change the ownership of the NFS share. In that case (as a workaround), you can use 'root' as the runAsUser at the container level, like below:

securityContext:
  runAsUser: 0

However, for this to work properly, the /etc/exports file on the NFS server should not squash root (i.e., it should use 'no_root_squash'). It should look something like this:

/var/rabbitmq/ 192.168.56.103(rw,sync,no_root_squash)

'no_root_squash' has its own security consequences. See details here: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/deployment_guide/s1-nfs-server-config-exports

In summary, in order to grant a pod access to a PV, you need to take into consideration:

  • Finding the group ID and/or user ID assigned to the actual storage (on the NFS server)
  • SELinux considerations
  • Making sure that the IDs allowed to access the physical storage match the requirements of the particular pod

The group IDs, the user ID, and SELinux values can be defined in the pod's securityContext section. User IDs can also be defined per container. So, in short, you can use the following user, group and SELinux options to find the right combination (a combined sketch follows the list):
  • supplementalGroups
  • fsGroup
  • runAsUser
  • seLinuxOptions
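As a rough combined illustration (the values are either taken from this demo or made up purely as placeholders, not recommendations):

spec:
  securityContext:               # pod level - applies to all containers
    runAsUser: 1000
    fsGroup: 1000
    supplementalGroups: [65534]
    seLinuxOptions:
      level: "s0:c123,c456"      # example label only; use values matching your SELinux policy
  containers:
  - name: app-cnt
    image: rabbitmq
    securityContext:             # container level - can override the user for this container
      runAsUser: 1000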

Hope it helps you a little bit!

Note: yaml files used in this post can be downloaded from Github location: https://github.com/pppoudel/kube-pv-pvc-demo

Upgrade to Docker EE 2.0 and UCP 3.x for Choice of Swarm or Kubernetes Orchestration

Docker Enterprise Edition (EE) 2.0 has introduced an integrated Kubernetes orchestration engine along with Swarm. Since Kubernetes is installed and configured as part of the upgrade to Docker EE 2.0 and Universal Control Plane (UCP) 3.x, it saves a lot of time that would otherwise be needed to install and set up a Kubernetes environment.


In this blog post, I'm discussing the upgrade process (not going to go through each step, though, because the official Docker documentation is detailed enough for that), pointing you to the right documentation, and also discussing a few issues that I encountered during the upgrade and how I resolved them.


Planning for Upgrade

1) Prerequisite check for hardware/software - Docker recommends at least 8 GB of physical memory available on UCP and Docker Trusted Registry (DTR) nodes and 4 GB for other worker nodes. See detailed hardware and software requirements here: https://docs.docker.com/ee/ucp/admin/install/system-requirements/

2) Firewall ports - since Kubernetes master and worker nodes will be part of the upgraded environment, the additional ports required for Kubernetes need to be opened. Details on the ports used can be found here: https://docs.docker.com/ee/ucp/admin/install/system-requirements/#ports-used. I put together a few lines of shell script to open the firewall ports (it uses the firewall-cmd utility). Use/modify it as needed.

openFWPortsForDockerEE.sh

#!/bin/sh
# openFWPortsForDockerEE.sh
# Opens required ports for Docker EE 2.0/UCP 3.x
# Ref:
# https://docs.docker.com/ee/ucp/admin/install/system-requirements/#ports-used
# https://docs.docker.com/datacenter/ucp/2.1/guides/admin/install/system-requirements/#network-requirements
tcp_ports="179,443,80,2375,2376,2377,2380,4001,4443,4789,6443,6444,7001,7946,8080,10250,12376-12387"
udp_ports="4789,7946"

openFW() {
   IFS=",";
   for _port in $1; do
      echo "Opening ${_port}/$2";
      sudo firewall-cmd --permanent --zone=public --add-port=${_port}/$2;
   done
   IFS=" ";
}

openFW "${tcp_ports}" tcp;
openFW "${udp_ports}" udp;

# Recycle firewall
sudo firewall-cmd --reload;

Backup Docker EE

You need to back up Docker Swarm, UCP, and DTR. Please follow this document (https://docs.docker.com/ee/backup/) for the backup procedure.

Upgrade Docker Engine

A very well documented step-by-step process can be found here: https://docs.docker.com/ee/upgrade/#upgrade-docker-engine

Upgrade UCP

UCP can be upgraded from the UCP web user interface (Web UI) or the command line interface (CLI). Both options are documented here: https://docs.docker.com/ee/ucp/admin/install/upgrade/#use-the-cli-to-perform-an-upgrade.

Note: If at all possible, try to use the CLI instead of the Web UI. I upgraded my personal DEV environment using the CLI and did not encounter any issues; however, one of my colleagues initially tried to use the Web UI and had issues. The upgrade process ran forever and failed.

Note: If you have less than 4 GB of memory, you'll get a warning during the upgrade. It may complete successfully (as you see below) or may fail, so it is best practice to fulfil the minimum requirement whenever possible. Below is the output from my UCP 3.0 upgrade:

$> sudo docker container run --rm -it --name ucp -v /var/run/docker.sock:/var/run/docker.sock docker/ucp:3.0.0 upgrade --interactive

INFO[0000] Your engine version 17.06.2-ee-10, build 66261a0 (3.10.0-514.el7.x86_64) is compatible
FATA[0000] Your system does not have enough memory. UCP suggests a minimum of 4.00 GB, but you only have 2.92 GB. You may have unexpected errors. You may proceed by specifying the '--force-minimums' flag, but you may experience scale and performance problems as a result
[osboxes@centosddcucp scripts]$ sudo docker container run --rm -it --name ucp -v /var/run/docker.sock:/var/run/docker.sock docker/ucp:3.0.0 upgrade --interactive --force-minimums
INFO[0000] Your engine version 17.06.2-ee-10, build 66261a0 (3.10.0-514.el7.x86_64) is compatible
WARN[0000] Your system does not have enough memory. UCP suggests a minimum of 4.00 GB, but you only have 2.92 GB. You may have unexpected errors.
WARN[0002] Your system uses devicemapper. We can not accurately detect available storage space. Please make sure you have at least 3.00 GB available in /var/lib/docker
INFO[0006] Upgrade the UCP 3.0.0 installation on this cluster to 3.0.0 for UCP ID: nufs9fb696bs6rm4kxaauewly
INFO[0006] Once this operation completes, all nodes in this cluster will be upgraded.
Do you want proceed with the upgrade? (y/n): y
INFO[0017] Pulling required images... (this may take a while)
INFO[0017] Pulling docker/ucp-interlock:3.0.0
INFO[0048] Pulling docker/ucp-compose:3.0.0
INFO[0130] Pulling docker/ucp-dsinfo:3.0.0
INFO[0183] Pulling docker/ucp-interlock-extension:3.0.0
WARN[0000] Your system does not have enough memory. UCP suggests a minimum of 4.00 GB, but you only have 2.92 GB. You may have unexpected errors.
WARN[0002] Your system uses devicemapper. We can not accurately detect available storage space. Please make sure you have at least 3.00 GB available in /var/lib/docker
INFO[0007] Checking for version compatibility
INFO[0007] Updating configuration for Interlock service
INFO[0038] Updating configuration for existing UCP service
INFO[0141] Waiting for cluster to finish upgrading
INFO[0146] Success! Please log in to the UCP console to verify your system.

Note: You may also find your upgrade to UCP 3.x process getting stuck while updating ucp-kv, just like we had in one of our environments. The symptom and resolution are documented here: https://success.docker.com/article/upgrade-to-ucp-3-gets-stuck-updating-ucp-kv


After the Upgrade

If you run 'docker ps' on the UCP host after the upgrade, all UCP-related processes (like docker/ucp-*) should be of version '3.x'. If you notice any of those processes still at version '2.x', it means the upgrade was not quite successful. You can also run 'docker version' and make sure the output shows UCP '3.x'.
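A quick sanity check along those lines might look like this (a sketch; the exact wording of the 'docker version' output differs between releases, so adjust the grep accordingly):

# List the UCP component images that are actually running
$> docker ps --format '{{.Image}}' | grep 'docker/ucp-' | sort -u

# Look for the UCP version reported by the engine
$> docker version | grep -i ucp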

If your upgrade is successful, you are going to notice a few things right away; some of them are listed below:

1) The UCP Web UI looks different now. You are going to see Kubernetes and related resources standing out as first-class citizens.

2) You may also notice that your application is not accessible any more even though the corresponding service(s) may seem to be running (specifically, if you used the HTTP Routing Mesh (HRM) before the upgrade). We encountered an issue (related to HRM) in our DEV environment. Before the upgrade, we had something like this configuration (a fragment from our yaml file):

version: "3.1"
services:
   testsvc:
      ...
      ...
      ports:
         - "9080"
         - "9443"
      deploy:
         ...
         ...
         labels:
            - "com.docker.ucp.mesh.http.9080=external_route=http://testsvc.devdte.com:8080,internal_port=9080"
            - "com.docker.ucp.mesh.http.9443=external_route=sni://testsvc.devdte.com:8443,internal_port=9443"
...
...



As shown above, internal port 9080 is mapped to external port 8080 (http), internal port 9443 is mapped to external port 8443 (https), and 'testsvc.devdte.com' is configured as the host in our routing mesh settings.


Before the upgrade, the above configuration allowed us to access the service as shown below:

  • http://testsvc.devdte.com:8080/xxx
    or
  • https://testsvc.devdte.com:8443/xxx

However, after the upgrade, we could access the application only on port 8443. If you encounter a similar issue, refer to Layer 7 routing upgrade for more details.


3) Another interesting issue we encountered after the upgrade was related to an HTTP header parameter being rejected. One of our applications relied on an HTTP header parameter that had an underscore '_' in it (something like 'user_name'). After the upgrade, the application suddenly started responding with HTTP status code 502. After investigation, we found out that Nginx (which is part of the Layer 7 routing solution) was silently rejecting this parameter because of the underscore. Refer to my blog post How to override Kubernetes Ingress-Nginx-Controller and Docker UCP Layer 7 Routing Configuration for details.

4) Lastly, if you are planning to use Kubernetes orchestration and the 'kubectl' utility to connect to the Kubernetes master, you need to download your client certificate bundle again. The env.sh/env.cmd in the new bundle has been updated to set the Kubernetes cluster, context and credentials configuration so that 'kubectl' can securely establish a connection to the Kubernetes master and communicate with it. Refer to CLI based access and Install the Kubernetes CLI for more details. Once you have installed 'kubectl' and downloaded and extracted the client certificate bundle, test connectivity to the Kubernetes master as follows:

# Change directory to the folder where you extracted your client certificate bundle
# and run following command to set kubernetes context, credentials and cluster configuration

$> eval "$(<env.sh)"
Cluster "ucp_ddcucphost:6443_ppoudel" set.
User "ucp_ddcucphost:6443_ppoudel" set.
Context "ucp_ddcucphost:6443_ppoudel" created.

# Confirm the connection to UCP. You should see something like this:


$> kubectl config current-context
ucp_ddcucphost:6443_ppoudel

# Inspect Kubernetes resources

$> kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   6d


How to override Kubernetes Ingress-Nginx-Controller and Docker UCP Layer 7 Routing Configuration

One of our dockerized applications mysteriously stopped working after we upgraded to Docker Enterprise Edition (EE) 2.0/Universal Control Plane (UCP) 3.x. After investigation, we found out that the Nginx used as part of the Docker Layer 7 routing solution was silently dropping an HTTP header parameter (refer to Missing (disappearing) HTTP Headers) which had an underscore '_' in it (something like 'user_name'), and our application required the value of that HTTP header parameter in order to function correctly. Note: our name-based virtual hosting relied on the Docker Layer 7 routing solution.
Later on, as part of our migration from Docker Swarm to Kubernetes, we encountered this issue again, as we were using Kubernetes' Ingress-Nginx-Controller.
In this post, I'm going to show how to resolve this issue whether it occurs with Docker UCP Layer 7 routing or with Kubernetes' Ingress-Nginx-Controller.


Overriding Kubernetes' Ingress-Nginx-Controller configuration

Create a configMap as shown below. In this example, I'm overriding the 'underscores_in_headers' Nginx configuration to 'on' from the default 'off'. Refer to this post to see which parameters are allowed in the configMap.

ingress-nginx-config.yml
apiVersion: v1
kind: ConfigMap
data:
   enable-underscores-in-headers: "true"
metadata:
   name: nginx-configuration
   namespace: ingress-nginx
   labels:
      app: ingress-nginx

The key here is:
data:
   enable-underscores-in-headers: "true"


If you have an existing configMap object 'nginx-configuration', you can edit it and update the value of the parameter you want to override. If the configMap object does not exist, you can create it using 'kubectl' as shown below; however, make sure you are referencing this configMap object in your controller's container spec (see the fragment after the commands below).

#edit
$> kubectl edit configMap/nginx-configuration -n ingress-nginx
# It opens the configuration into your editor, you can update any configuration and save. Saving the yaml will update the resource in the API server.

# Create
$> kubectl create -f ingress-nginx-config.yml -n ingress-nginx
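For reference, the stock ingress-nginx manifests pass this ConfigMap to the controller through the '--configmap' argument; the fragment below is what that typically looks like (container name and namespace variable are taken from the standard manifests, so verify them against your own deployment):

    containers:
    - name: nginx-ingress-controller
      # image line omitted; use whatever controller version you deploy
      args:
      - /nginx-ingress-controller
      - --configmap=$(POD_NAMESPACE)/nginx-configuration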

In order to verify whether the configuration of the ingress-nginx-controller has been updated, you can do the following:
  1. Find the ingress-nginx-controller pod using the following command:
    $> kubectl get pods -n ingress-nginx
  2. Check the nginx.conf file and make sure the parameter you are overriding has been updated. In this case we are looking for the underscores_in_headers value to change from 'off' to 'on':
    $> kubectl exec nginx-ingress-controller-68db848949-ncvj7 -n ingress-nginx cat /etc/nginx/nginx.conf | grep underscores_in_headers
    underscores_in_headers on;


Overriding/customizing Docker Layer 7 routing solution configuration 

You can perform the following steps using the Docker CLI. Make sure a secure connection has been established from where you are running the Docker CLI to UCP. You can do this using the Client Certificate Bundle.


  1. # Export the current ucp-interlock configuration name to the CURRENT_CONFIG_NAME variable
    $> CURRENT_CONFIG_NAME=$(docker service inspect --format '{{ (index .Spec.TaskTemplate.ContainerSpec.Configs 0).ConfigName }}' ucp-interlock)


  2. # Write information to config.toml file
    $> docker config inspect --format '{{ printf "%s" .Spec.Data }}' $CURRENT_CONFIG_NAME > config.toml

  3. # Update config.toml as below. In this case we are overriding the value of the nginx
    # configuration 'underscores_in_headers' from 'off' to 'on' by changing the ucp-interlock service
    # configuration 'UnderscoresInHeaders' value from 'false' to 'true' (a one-line sketch of this edit appears after this list)

  4. # Create a new config object from the updated config.toml
    # (UPDATED_CONFIG_NAME is a placeholder; pick a new name and use the same name in step 6)
    $> docker config create UPDATED_CONFIG_NAME config.toml

  5. # Verify the object created:
    $> docker config ls
    ID                          NAME                  CREATED         UPDATED
    061xu64qyotlbtrdz9l5e1s0h   UPDATED_CONFIG_NAME   6 seconds ago   6 seconds ago

  6. # Update the ucp-interlock service to start using the new configuration:
    $> docker service update \
    --config-rm $CURRENT_CONFIG_NAME \
    --config-add source=UPDATED_CONFIG_NAME,target=/config.toml \
    ucp-interlock

  7. # Wait for a minute and make sure the interlock service started successfully. Check the timestamp:
    $> docker ps | grep interlock

  8. # Rollback (if necessary)
    $> docker service update --update-failure-action rollback ucp-interlock
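Regarding step 3 above: if your exported config.toml already contains the 'UnderscoresInHeaders' key, the edit can be as simple as the sketch below (a sed one-liner assumed for convenience; you can just as well make the change in a text editor):

    $> sed -i 's/UnderscoresInHeaders = false/UnderscoresInHeaders = true/' config.toml
    $> grep UnderscoresInHeaders config.toml    # should now show the value as true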

Note: the above steps can be used to update/override any other Layer 7 routing configuration. Refer to the Layer 7 routing configuration reference to find out about all the other configurable properties.

Note: Every time you restart (disable/enable) the Layer 7 routing solution from the UCP UI, it starts with the default configuration, so you have to perform the above steps again to override the configuration.

Port Conflict and Resolution in WebSphere Application Server and Liberty Profile

Port conflict is one of the frequent issues that integrators and administrators encounter. In this post, I'm going to discuss how to resolve a port conflict between WebSphere Application Server traditional (WASt) and WebSphere Liberty Profile (WLP). Recently, while doing a Proof of Concept (PoC), I had to install and launch a WLP server on the same host where WASt was running, and I received the following error in the WLP's messages.log:

[4/15/18 16:59:00:048 BST] 00000017 ibm.ws.transport.iiop.security.config.ssl.yoko.SocketFactory E CWWKS9580E: The server socket could not be opened on localhost:2,809.  The exception message is Address already in use (Bind failed).
[3/19/18 9:42:51:037 EDT] 00000017 LogService-271-com.ibm.ws.management.j2ee.mejb               E CWWKE0701E: [com.ibm.ws.management.j2ee.mejb.service.ManagementEJBService(77)] The setServerStarted method has thrown an exception Bundle:com.ibm.ws.management.j2ee.mejb(id=271) java.lang.IllegalStateException: com.ibm.ws.ejbcontainer.osgi.internal.EJBRuntimeException: com.ibm.ws.exception.RuntimeError: java.lang.IllegalStateException: The orb is not available
        at com.ibm.ws.ejbcontainer.osgi.internal.EJBContainerImpl.startSystemModule(EJBContainerImpl.java:230)
        at com.ibm.ws.management.j2ee.mejb.service.ManagementEJBService.startManagementEJB(ManagementEJBService.java:161)
        ...
        Caused by: com.ibm.ws.ejbcontainer.osgi.internal.EJBRuntimeException: com.ibm.ws.exception.RuntimeError: java.lang.IllegalStateException: The orb is not available
        at com.ibm.ws.ejbcontainer.osgi.internal.EJBRuntimeImpl.startSystemModule(EJBRuntimeImpl.java:968)
        at com.ibm.ws.ejbcontainer.osgi.internal.EJBContainerImpl.startSystemModule(EJBContainerImpl.java:228)
        ... 39 more
Caused by: com.ibm.ws.exception.RuntimeError: java.lang.IllegalStateException: The orb is not available
        at com.ibm.ws.ejbcontainer.runtime.AbstractEJBRuntime.startModule(AbstractEJBRuntime.java:587)
        at com.ibm.ws.ejbcontainer.osgi.internal.EJBRuntimeImpl.startSystemModule(EJBRuntimeImpl.java:964)
        ... 40 more
Caused by: java.lang.IllegalStateException: The orb is not available
        at com.ibm.ws.ejbcontainer.remote.internal.EJBRemoteRuntimeImpl.bind(EJBRemoteRuntimeImpl.java:189)

As per the error message above, the WLP server could not start the object request broker (ORB) service because port 2809 was already in use. Note: WLP (by default) uses port 2809 for the ORB.
The next step is to find out which process is (already) using this port. You can use netstat -ntlp | grep <port>, e.g. netstat -ntlp | grep 2809.
Once you find the PID (process id) that is listening on that particular port, you can get more detail about the process by using ps -ef | grep <PID>. In my case, port 2809 was being used by the WASt Nodeagent process (see the sketch below).
Now, I had two choices: either change the WASt Nodeagent's port number or change the IIOP endpoint port of the WLP server. I have tested both, and below I'm going to show both options.
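For reference, the lookup itself is just a couple of commands (assuming a Linux host; replace <PID> with whatever process id the first command reports):

# Which process is listening on port 2809?
$> sudo netstat -ntlp | grep 2809

# What is that process?
$> ps -ef | grep <PID>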

Change the iiopEndpoint port for WLP server


It is simple: just open the server.xml for the WLP server and add the following lines. Here I'm changing iiopPort to 2709 and iiopsPort to 9403.
Note: Make sure those new ports are not currently being used by any process. You can check whether they are currently in use with: netstat -na | egrep '(2709|9403)'

<iiopEndpoint id="defaultIiopEndpoint" iiopPort="2709">
   <iiopsOptions iiopsPort="9403" sslRef="defaultSSLConfig">
   </iiopsOptions>
</iiopEndpoint>

Refer to https://www.ibm.com/support/knowledgecenter/en/SS7K4U_liberty/com.ibm.websphere.wlp.zseries.doc/ae/rwlp_portnums.html to find out WLP default port numbers.
Restart the WLP server after updating the port numbers in server.xml and review messages.log again to make sure the port binding errors are gone.

Update/Change the port on WASt side


You can change/update the port on the WASt side in two ways: either using the Administration console or by using the wsadmin command. To find out more details about the PortManagement command group for the AdminTask object, refer to https://www.ibm.com/support/knowledgecenter/en/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/rxml_atportmgt.html

Using wsadmin command:

  1. Connect to Dmgr
    $> ./wsadmin.sh
    WASX7209I: Connected to process "dmgr" on node ubuntuwas9CellManager01 using SOAP connector; The type of process is: DeploymentManager
    WASX7031I: For help, enter: "print Help.help()"
  2. Find out the current port in question. In my case, it is for nodeagent
    wsadmin>AdminTask.listServerPorts('nodeagent', '[-nodeName ubuntuwas9Node02]') 

    u'[[IPC_CONNECTOR_ADDRESS [[[host localhost] [node ubuntuwas9Node02] [server nodeagent] [port 9629] ]]] ]\n[[CSIV2_SSL_SERVERAUTH_LISTENER_ADDRESS [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 9201] ]]] ]\n[[XDAGENT_PORT [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 7062] ]]] ]\n[[OVERLAY_UDP_LISTENER_ADDRESS [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 11003] ]]] ]\n[[DCS_UNICAST_ADDRESS [[[host *] [node ubuntuwas9Node02] [server nodeagent] [port 9353] ]]] ]\n[[NODE_DISCOVERY_ADDRESS [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 7272] ]]] ]\n[[BOOTSTRAP_ADDRESS [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 2809] ]]] ]\n[[NODE_IPV6_MULTICAST_DISCOVERY_ADDRESS [[[host ff01::1] [node ubuntuwas9Node02] [server nodeagent] [port 5001] ]]] ]\n[[SAS_SSL_SERVERAUTH_LISTENER_ADDRESS [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 9901] ]]] ]\n[[SOAP_CONNECTOR_ADDRESS [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 8878] ]]] ]\n[[NODE_MULTICAST_DISCOVERY_ADDRESS [[[host 232.133.104.73] [node ubuntuwas9Node02] [server nodeagent] [port 5000] ]]] ]\n[[ORB_LISTENER_ADDRESS [[[host ubuntuwas9][node ubuntuwas9Node02] [server nodeagent] [port 9900] ]]] ]\n[[CSIV2_SSL_MUTUALAUTH_LISTENER_ADDRESS [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 9202] ]]] ]\n[[OVERLAY_TCP_LISTENER_ADDRESS [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 11004] ]]] ]'
  3. As seen from the command output above, the nodeagent is listening on port 2809 (BOOTSTRAP_ADDRESS); let's change it to 2709.
    wsadmin>AdminTask.modifyServerPort ('nodeagent', '[-nodeName ubuntuwas9Node02 -endPointName BOOTSTRAP_ADDRESS -port 2709 -modifyShared true]')

    u'[[BOOTSTRAP_ADDRESS [[[host ubuntuwas9] [node ubuntuwas9Node02] [server nodeagent] [port 2709] ]]] ]'
  4. Save the changes:
    wsadmin>AdminConfig.save()
  5. If it's a federated environment, make sure to synchronise the configuration with the node(s). If your Nodeagent is currently stopped, go to the host machine where the Nodeagent is installed and run syncNode.sh <dmgr-host> as shown below:
    wasadmin@ubuntuwas9:/opt/ibm/WebSphere/AppServer/profiles/AppSrv01/bin$ ./syncNode.sh ubuntuwas9 

    ADMU0116I: Tool information is being logged in file /opt/ibm/WebSphere/AppServer/profiles/AppSrv01/logs/syncNode.log
    ADMU0128I: Starting tool with the AppSrv01 profile
    ADMU0401I: Begin syncNode operation for node ubuntuwas9Node02 with Deployment
    Manager ubuntuwas9: 8879
    ADMU0016I: Synchronizing configuration between node and cell.
    ADMU0402I: The configuration for node ubuntuwas9Node02 has been synchronized with Deployment Manager ubuntuwas9: 8879

    If your nodeagent is running, you can use the following commands to sync:

    wsadmin>dmgrObj=AdminControl.queryNames('WebSphere:type=DeploymentManager,*')
    wsadmin>AdminControl.invoke(dmgrObj, 'multiSync', '[false]', '[java.lang.Boolean]')
  6. (Re)start the Nodeagent:
    Note: you can use the startNode.sh/stopNode.sh commands from the profile_root/bin directory.
    If the nodeagent is stopped, start it:
    ./startNode.sh
    If it is currently running, stop and start it:
    ./stopNode.sh
    ./startNode.sh

Using WASt Administration console:

  1. Access WASt Administration console: https://<host>:<port>/admin
  2. Once logged in, on the left panel, navigate to "System Administration" and click on "Node agents"
  3. On the right hand side, click on "Ports" link under "Additional Properties". You can see a table that contains port name and port numbers.
  4. Click on "Details" button beside the table in order to update any port.
  5. Click on the name of the port you want to change.
  6. An editable screen appears; update the port and click "Apply".
  7. Make sure to Synchronise the change with the nodes.
  8. Restart the affected server. (Nodeagent in this case).

Hope this post will be helpful in your next port conflict resolution!

It's in You to Give

Today (March 27, 2018), I proudly wore my jacket with the lapel pin of Canadian Blood Services on it again. And yes, I donated blood for the 2nd time, and I hope to do so much more often going forward. Don't get me wrong, this post is not really to celebrate my donation, but to encourage others like myself who are just starting to donate or thinking about it. We all need to come forward and do this noble thing, because our people, communities and countries need blood all the time.

It does not cost anything. As the Canadian Blood Services puts it in simple words, "It's in you to give." Surprisingly, blood donors get some health benefits as well. See Donor health benefits section of Wikipedia.
Finally, voluntary blood donation is a very important concept and we need to support it. See World Health Organisation's paper entitled "Towards 100% Voluntary Blood Donation - A Global Framework for Action".
Believe me, it's not hard. If I can do it, anyone can. Just make sure you're well hydrated before sitting in for the donation. If you are in Canada call the Canadian Blood Services at 1 888 2 DONATE (1-888-236-6283) or visit their website at www.blood.ca to schedule your appointment. If you are in any other jurisdiction, contact your national blood services to donate!

Update as of July 05, 2019:
Cheers again! Donated today for the third time and felt awesome.

How to Parse Apache error_log for Troubleshooting & Reporting


Note: if you haven't already, see Log Parsing, Analysis, Correlation, and Reporting Engine post first.

The Apache error_log can be useful while troubleshooting production problems, so parsing and analysing its content regularly helps in maintaining the overall health of the system. If mod_mpmstats is enabled, error_log also contains Multi-Processing Module (MPM) stats data. MPM stats can be used for both troubleshooting and performance tuning; http://publib.boulder.ibm.com/httpserv/manual70/mod/mod_mpmstats.html provides more details, and a rough sketch of enabling the module follows below.
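For IBM HTTP Server, enabling the module in httpd.conf looks roughly like the fragment below (the module path and directives are my reading of the linked documentation; check it for the exact options available in your IHS version):

# Load the MPM stats module shipped with IHS
LoadModule mpmstats_module modules/debug/mod_mpmstats.so
<IfModule mod_mpmstats.c>
   # Write a stats line to error_log every 600 seconds
   ReportInterval 600
</IfModule>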
Since error_log does not contain the Web server name, in order to correlate the data to the corresponding Web server, it is advisable to put the error_log files for each Web server under a corresponding directory named after the Web server. This is especially important when you are parsing logs from multiple Web servers: the script takes the directory name as the Web server name for the purpose of reporting and analysis. For example, let's say you have Web servers 'webSrv01', 'webSrv02', 'webSrv03', etc.; then put the logs from each Web server under the corresponding directories as shown below:

 /tmp/webSrv01
    error_log
    error_log_2017.09.05.log
    access_log
 /tmp/webSrv02
    error_log
    error_log_2017.09.05.log
    access_log

The naming suffix for historical files can differ from one environment to another, so if your historical files use a different suffix, you can tweak the find command in the script. Currently, the fragment of the script that finds error_log looks like this:

find $rootcontext -name "error_log*" -type f | grep "$logFileName"
where $rootcontext is the root path.
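As a hypothetical example of such a tweak: if you want the script to pick up only the current file plus rotations that follow a particular naming pattern, you could tighten the match like this (pattern shown for illustration only):

find $rootcontext -type f \( -name "error_log" -o -name "error_log_[0-9]*.log" \) | grep "$logFileName"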

Review the actual script available in github - https://github.com/pppoudel/log-parser/blob/master/webErrorLogParser.sh for details.

Note: the script is written to parse date formats like '[Thu Dec 14 08:13:08 2017]' in error_log. If your error_log uses a different date format, you may need to tweak the section of the script that parses the date.

How to execute:
You can see all the available options by just launching:
$> ./webErrorLogParser.sh

See below for a few examples:
# processing current day's logs
$> ./webErrorLogParser.sh --rootcontext <log-path>

# processing yesterday's logs with historical report updates
$> ./webErrorLogParser.sh --rootcontext <log-path> --rpttype daily

# processing any day's logs updates
$> ./webErrorLogParser.sh --rootcontext <log-path> --recorddate <date in (YYYY-MM-DD) format>


Output
Report/Output files:
  • $rptDir/00_Alert.txt
  • $rptDir/03_WebErrorLogSummaryRpt.txt
  • $rptDir/WebErrorLogMpmStatsRpt_all.csv
  • $rptDir/WebErrorLogRpt_all.csv
Where $rptDir is report directory. Default value is $TMP/$recDate

History Report/Output files:
# These are historical reports. Each run will append record in existing report file.
  • $pDir/RecycleHistoryRpt_all.csv
  • $pDir/MPMStatsHistoryRpt.csv
Where $pDir is parent of $rptDir.

See sample summary report in github - https://github.com/pppoudel/log-parser/blob/master/sample_reports/03_WebErrorLogSummaryRpt.txt
And here is a sample MPM stats report https://github.com/pppoudel/log-parser/blob/master/sample_reports/WebErrorLogMpmStatsRpt_all.csv

See my other posts in this series
  1. websphereLogParser.sh for parsing, analyzing and reporting WebSphere Application Server (WAS) SystemOut.log
  2. webAccessLogParser.sh for parsing, analyzing and reporting Apache/IBM HTTP Server (IHS) access_log
  3. javaGCStatsParser.sh for parsing, analyzing and reporting Java verbose Garbage Collection (GC) log