Tuesday, September 27, 2016

How to do advance level OSGI debug to resolve dependency conflicts

The reason behind writing this article was due to the intermittent issue[1] reported in WSO2 EMM server startup. In order to find the root cause I had to debug the code at OSGI level to identify the conflicts, which gave a great length of knowledge on how OSGI works. Therefore I thought of sharing the knowledge I gained through the process.

Problem:
When starting the EMM server sometimes webapps don't get deploy due to ClassNotFoundException. The webapps refer some classes in org.wso2.carbon.device.mgt.common OSGI bundle.

Reasoning (Thinking pattern):
Since this is an intermittent issue, the first thing that comes to our mind is that sometimes webapps get deployed before OSGI resolve the org.wso2.carbon.device.mgt.common bundle. But WSO2 platform is built in a way that webapp deployments always happen after the server startup. Therefore the next doubt that comes to our mind is dynamic imports. When we set the resolution "optional" for dynamic import in OSGI configurations, OSGI will resolve the bundle first and later try to resolve the dynamic imports (dependencies). At this time, for a particular dynamic import, if OSGI encounter any conflicts, it will skip resolving the dynamic import and try again later after iterating through the list. This cause might delay the class loading and result in ClassNotFoundException for webapps.

Debugging:
How can we debug this issue? 
There are several ways to debug...
  1. Start the server with OSGI console and investigate
  2. Enabling OSGI level debug logs
  3. Debugging the Equinox OSGI source 
1) Start the server with OSGI console and investigate
You can start the server OSGI console wih below command.
           sh wso2server.sh -DosgiConsole

Inspecting the org.wso2.carbon.device.mgt.common bundle status.

osgi> ss device.mgt.common
id State       Bundle
262 ACTIVE      org.wso2.carbon.device.mgt.common_1.2.2.SNAPSHOT

osgi> b 262
org.wso2.carbon.device.mgt.common_1.2.2.SNAPSHOT [262]
  Id=262, Status=ACTIVE      Data Root=/home/daag/work/product-emm/modules/distribution/target/wso2emm-2.2.0-SNAPSHOT/repository/components/default/configuration/org.eclipse.osgi/bundles/262/data
  "No registered services."
  No services in use.
  Exported packages
    org.wso2.carbon.device.mgt.common.configuration.mgt; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.permission.mgt; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.scope.mgt; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.app.mgt; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.authorization; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.operation.mgt; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.notification.mgt; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.license.mgt; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.spi; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.device.details; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.push.notification; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.group.mgt; version="1.2.2.SNAPSHOT"[exported]
    org.wso2.carbon.device.mgt.common.search; version="1.2.2.SNAPSHOT"[exported]
  Imported packages
    com.fasterxml.jackson.annotation; version="2.6.1.wso2v1"<jackson-annotations_2.6.1.wso2v1 [72]>
    javax.xml.bind.annotation; version="0.0.0"<org.eclipse.osgi_3.9.1.v20130814-1242 [0]>
    org.wso2.carbon.apimgt.api.model; version="6.0.5"<org.wso2.carbon.apimgt.api_6.0.5 [209]>
    io.swagger.annotations; version="[1.5.0,2.0.0)"<unwired><optional>
  No fragment bundles
  Named class space
    org.wso2.carbon.device.mgt.common; bundle-version="1.2.2.SNAPSHOT"[provided]
  No required bundles

With above logs, we can ensure the org.wso2.carbon.device.mgt.common bundle is in active state. Therefore we need to go deep in debugging to investigate the root cause.

2) Enabling OSGI level debug logs in EMM server.
We can use this method to track the dynamic class loading in OSGI.
See <EMM_HOME>/repositories/conf/etc/osgi-debug.options file. You will see there are many levels of OSGI debug options. for our scenario, we can enable the below debug options.

### OSGi resolver options
# Turns on debugging for the resolver
org.eclipse.osgi/resolver/debug = true
# Prints out wiring information after the resolver has completed the resolve process
org.eclipse.osgi/resolver/wiring = true
# Prints out Import-Package information
org.eclipse.osgi/resolver/imports = true

After enabling those debug logs, start the server with bellow command.

sh wso2server.sh -DosgiDebugOptions

You will start seeing large number of debug logs printed in the console. By tracking these logs we can get an idea of how things happen.

Ex: Following line shows that the dynamic class loading for org.wso2.carbon.device.mgt.common bundle fails over and over due to dependency issue.

Failed to resolve dynamic import: org.wso2.carbon.device.mgt.common

So how to track which dynamic import cause this issue?... For this, we will have to go to 2nd level debugging with Equinox OSGI source

3) Debugging the Equinox OSGI source 

Load it to IntelliJ Idea.
  • CTRL+N
  • Type the class name "GroupingChecker"
  • Once opened click on the "Choose sources..." on the top right corner of the file and load the downloaded source jar.

Now we can set the debug point. The place we need to debug is org.eclipse.osgi.internal.module.GroupingChecker#isConsistentInternal method.



Here it checks for any dependency conflicts. If there are any dependency conflicts, those will be returned to results list object. Else it returns null. Since this method gets hit very frequent for every bundle that gets loaded, we need to add a condition to this debug point.
  • Right click on the debug point (red dot) and add the following condition.
matchingExport.getBaseDescription().getName().equals("org.wso2.carbon.device.mgt.common") && results != null


Now when you start the server with remote debug enabled. If there are any conflicts for org.wso2.carbon.device.mgt.common bundle, that information will be returned to results list object and the debug point will get triggered.

In my case, after multiple server restarts, I was able to see a version conflict with the com.fasterxml.jackson.annotation dynamic import bundle.

When I check this in the server OSGI console with below command, I saw that it has loaded 2 versions (2.7.4 and 2.6.1wso2v1) of the same bundle.

sh wso2server.sh -DosgiConsole

osgi> p com.fasterxml.jackson.annotation

The org.wso2.carbon.device.mgt.common bundle is configured to use the latest version of the bundle. When OSGI framework first loads the 2.6.1wso2v1 version, the common bundle fails to do the dynamic import. This result in not loading the classes and eventually triggers the CNF Exception.

Solution: 
Updating the versions back to one single version resolved the issue.

[1] https://wso2.org/jira/browse/EMM-1646

Hope you learned something new from this investigation. 
Happy debugging !!!