Add your own detection criteria

maaaaz edited this page Jul 22, 2012 · 11 revisions

This how-to describes how to add a new criterion to the Androwarn analysis-synthesis chain, so that you can quickly integrate your own detection criteria into Androwarn.

I recommend taking a look at this wiki entry first for a brief overview of the project structure.

Before going any further, here is a brief reminder of the current categories of malicious behaviors:

* Telephony identifiers exfiltration: IMEI, IMSI, MCC, MNC, LAC, CID, operator's name...
* Device settings exfiltration: software version, usage statistics, system settings, logs...
* Geolocation information leakage: GPS/WiFi geolocation...
* Connection interfaces information exfiltration: WiFi credentials, Bluetooth MAC address...
* Telephony services abuse: premium SMS sending, phone call composition...
* Audio/video flow interception: call recording, video capture...
* Remote connection establishment: socket open call, Bluetooth pairing, APN settings edit...
* PIM data leakage: contacts, calendar, SMS, mails...
* External memory operations: file access on SD card...
* PIM data modification: add/delete contacts, calendar events...
* Arbitrary code execution: native code using JNI, UNIX command, privilege escalation...
* Denial of Service: event notification deactivation, file deletion, process killing, virtual keyboard disable, terminal shutdown/reboot...

First, ask yourself: "Does my new detection criterion belong to an existing category, or do I need to create a new one?"
The procedure differs slightly depending on the answer.

In this tutorial, I take the example of adding a new category named "Telephony services abuse". As the reminder above shows, this category is already present in the current Androwarn version; it is used here purely as an illustration.

If your new criterion fits into an existing category, you don't need to create a new one, so skip steps 1, 2 and 3.

1. Create a new category

The data hierarchy for synthesis can be found in androwarn/androwarn/analysis/analysis.py in the perform_analysis() function.

def perform_analysis(apk_file, a, d, x, no_connection) :
    """
        @param apk_file         : apk file path
        @param a                : an APK instance
        @param d                : a DalvikVMFormat instance
        @param x                : a VMAnalysis instance
        @param no_connection    : boolean value, enable/disable online lookups
    
        @rtype : a list of dictionaries of strings lists [ { "application_information": [ ("application_name", ["com.test"]), ("application_version", ["1.0"]) ] }, { ... }]
    """
    # application general information 
    app_package_name = grab_application_package_name(a)
    app_name, app_desc, app_icon = grab_application_name_description_icon(app_package_name, no_connection)
    app_description = [app_icon, app_desc]
    
    # data gathering
    data = []
    
    data.append(
                { 'application_information' :
                    [
                        ( 'application_name',                       [app_name] ),
                        ( 'application_version',                    [grab_androidversion_name(a)] ),
                        ( 'package_name',                           [app_package_name] ),
                        ( 'description',                             app_description )
                    ]
                }
    )
    
    data.append(
                { 'analysis_results' :
                    [
                        ( 'telephony_identifiers_leakage',           gather_telephony_identifiers_leakage(x) ),
                        ( 'device_settings_harvesting',              gather_device_settings_harvesting(x) ),
                        ( 'location_lookup',                         gather_location_lookup(x) ),
                        ( 'connection_interfaces_exfiltration',      gather_connection_interfaces_exfiltration(x) ),
                        ( 'telephony_services_abuse',                gather_telephony_services_abuse(x) ),                                      
                        ( 'audio_video_eavesdropping',               gather_audio_video_eavesdropping(x) ),
                        ( 'suspicious_connection_establishment',     gather_suspicious_connection_establishment(x) ),
                        ( 'PIM_data_leakage',                        gather_PIM_data_leakage(x) ),
                        ( 'code_execution',                          gather_code_execution(x) ),
                    ],
                }
    )
    
    data.append(
                { 'apk_file' :
                    [
                        ( 'apk_file_name',                          [grab_filename(a)] ),
                        ( 'SHA-1_hash',                             [grab_apk_file_sha1_hash(apk_file)] ),
                        ( 'file_list',                               grab_file_list(a) ),
                        ( 'certificate_information',                 grab_certificate_information(a) )
                    ]
                }
    )   
    
    data.append(
                { 'androidmanifest.xml' :
                    [
                        ( 'main_activity',                          [grab_main_activity(a)] ),
                        ( 'activities',                              grab_activities(a) ),
                        ( 'services',                                grab_services(a) ),
                        ( 'receivers',                               grab_receivers(a) ),
                        ( 'providers',                               grab_providers(a) ),
                        ( 'permissions',                             grab_permissions(a) ),
                        ( 'features',                                grab_features(a) ),
                        ( 'libraries',                               grab_libraries(a) )
                    ]
                }
    )

    data.append(
                { 'apis_used' :
                    [
                        ( 'classes_list',                            grab_classes_list(x) ),
                        ( 'internal_classes_list',                   grab_internal_classes_list(x) ),
                        ( 'external_classes_list',                   grab_external_classes_list(x) ),
                        ( 'internal_packages_list',                  grab_internal_packages_list(x) ),
                        ( 'external_packages_list',                  grab_external_packages_list(x) )
                    ]
                }
    )   
    
    return data

The hierarchy is simple: analysis_results is the first level, and each tuple beneath it, such as telephony_identifiers_leakage, sits on the second level.
Each item is mapped to a corresponding gather_XXX function such as gather_telephony_identifiers_leakage(x). This function gathers all the intelligible, user-friendly sentences for its category.

So, to add your category, simply add the tuple ( 'telephony_services_abuse', gather_telephony_services_abuse(x) ) to the analysis_results list of the dataset.
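
As a minimal sketch of that two-level hierarchy, the snippet below builds a reduced dataset around a stubbed gather function and walks it. The stub body, build_analysis_results and iter_category_results are hypothetical names introduced only for illustration; they are not part of Androwarn.

```python
def gather_telephony_services_abuse(x):
    # Hypothetical stub: the real function inspects the VMAnalysis instance x.
    return ['This application makes phone calls']


def build_analysis_results(x):
    # First level: the 'analysis_results' key.
    # Second level: (category_name, list_of_sentences) tuples.
    return {
        'analysis_results': [
            ('telephony_services_abuse', gather_telephony_services_abuse(x)),
        ]
    }


def iter_category_results(entry):
    # Hypothetical helper: yield (category, sentence) pairs from one entry.
    for _section, items in entry.items():
        for category, sentences in items:
            for sentence in sentences:
                yield category, sentence


entry = build_analysis_results(x=None)
pairs = list(iter_category_results(entry))
```

Each sentence stays attached to its category name, which is the key later used to look up the category's verbosity weight.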

2. Create the associated python file

Go into androwarn/search/malicious_behaviours and create a new Python file, named telephony_services.py for instance.
Feel free to copy the header of another detection criteria file for the imports and the logger declaration.
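
A possible starting skeleton for the new file is sketched below. It relies only on the standard logging module; the logger name and the placement of the project-specific imports are assumptions, so copy the real header from a sibling detection file rather than trusting this one.

```python
# telephony_services.py: hypothetical skeleton for a new category file
import logging

# Assumption: 'log' is a placeholder logger name; reuse whatever name
# the existing detection files declare.
log = logging.getLogger('log')

# Assumption: the project-specific imports copied from a sibling file go here.


def gather_telephony_services_abuse(x):
    """
        @param x : a VMAnalysis instance

        @rtype : a list of strings for the category
    """
    result = []
    # The detection functions are written in step 4; until then the
    # category reports nothing.
    log.debug("gather_telephony_services_abuse called")
    return result
```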

3. Reference your new category in the global imports

Open the file androwarn/search/search.py and add this line to import your new telephony_services.py file:

from androwarn.search.malicious_behaviours.telephony_services import *

4. Code your detection criteria

You need to code your analysis functions and the gather function inside the category's Python file located in the androwarn/search/malicious_behaviours directory (telephony_services.py for this example):

  • detect_Telephony_SMS_abuse(x): your first criterion
def detect_Telephony_SMS_abuse(x) :
    """
        @param x : a VMAnalysis instance
        
        @rtype : a list of formatted strings
    """
    formatted_str = []
    
    structural_analysis_results = x.tainted_packages.search_methods("Landroid/telephony/SmsManager","sendTextMessage", ".")
    
    for result in xrange(len(structural_analysis_results)) :
        registers = data_flow_analysis(structural_analysis_results, result, x)      
        
        if len(registers) > 3 :
            target_phone_number = get_register_value(1, registers)
            sms_message         = get_register_value(3, registers)
            
            local_formatted_str = "This application sends an SMS message '%s' to the '%s' phone number" % (sms_message, target_phone_number)
            if local_formatted_str not in formatted_str :
                formatted_str.append(local_formatted_str)
    return formatted_str
  • detect_Telephony_Phone_Call_abuse(x): your second criterion
def detect_Telephony_Phone_Call_abuse(x) :
    """
        @param x : a VMAnalysis instance
        
        @rtype : a list of formatted strings
    """
    formatted_str = []
    
    detector_1 = search_string(x, "android.intent.action.CALL")
    detector_2 = search_string(x, "android.intent.action.DIAL")
        
    detectors = [detector_1, detector_2]
    
    if detector_tab_is_not_empty(detectors) :
        local_formatted_str = 'This application makes phone calls'
        formatted_str.append(local_formatted_str)
        
        for res in detectors :
            if res :
                try :
                    log_result_path_information(res, "Call Intent", "string")
                except Exception :
                    log.warn("Detector result '%s' is not a PathVariable instance" % res)
        
    return formatted_str
  • gather_telephony_services_abuse(x): a function that aggregates all the result sentences for a category, in our case the results of the two functions above. Remember that this is the function called in the androwarn/androwarn/analysis/analysis.py dataset.
def gather_telephony_services_abuse(x) :
    """
        @param x : a VMAnalysis instance
    
        @rtype : a list of strings for the concerned category, for example [ 'This application makes phone calls', "This application sends an SMS message 'Premium SMS' to the '12345' phone number" ]
    """
    result = []
    
    result.extend( detect_Telephony_Phone_Call_abuse(x) )
    result.extend( detect_Telephony_SMS_abuse(x) )
    
    return result
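
Because every detect_XXX function returns a plain list of strings, the aggregation and de-duplication logic can be exercised without an APK. The sketch below replaces the real detectors with hypothetical stubs (detect_stub_sms, detect_stub_call) and a hypothetical append_unique helper; none of these names exist in Androwarn.

```python
def append_unique(sentences, sentence):
    # Same de-duplication idea as in detect_Telephony_SMS_abuse.
    if sentence not in sentences:
        sentences.append(sentence)


def detect_stub_sms(x):
    # Hypothetical stub: pretends two call sites send the same SMS.
    formatted_str = []
    call_sites = [('Premium SMS', '12345'), ('Premium SMS', '12345')]
    for sms_message, target_phone_number in call_sites:
        append_unique(
            formatted_str,
            "This application sends an SMS message '%s' to the '%s' phone number"
            % (sms_message, target_phone_number)
        )
    return formatted_str


def detect_stub_call(x):
    # Hypothetical stub: pretends a CALL intent string was found.
    return ['This application makes phone calls']


def gather_stub_telephony_services_abuse(x):
    # Mirrors the gather function above: extend in a fixed order.
    result = []
    result.extend(detect_stub_call(x))
    result.extend(detect_stub_sms(x))
    return result


sentences = gather_stub_telephony_services_abuse(None)
```

The two identical call sites collapse into a single sentence, so the report lists each distinct finding once.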

5. Weight your criteria according to the user's verbosity level

After developing your own criteria, you should give the category a weight so that the report respects the user's chosen verbosity.
The current predefined levels are:

Essential (-v 1) for newbies
Advanced (-v 2)
Expert (-v 3)

To do so, open the androwarn/report/report.py file and take a look at the data_level structure:

data_level  = {
                    # Application
                     'application_name'                     : 1 ,
                     'application_version'                  : 1 ,
                     'package_name'                         : 1 ,
                     'description'                          : 1 ,
                    
                    
                    # Malicious Behaviours Detection
                    # -- Telephony identifiers leakage              
                     'telephony_identifiers_leakage'        : 1 ,
                    
                    # -- Device settings harvesting             
                     'device_settings_harvesting'           : 1 ,
                    
                    # -- Physical location lookup
                     'location_lookup'                      : 1 ,

                    # -- Connection interfaces information exfiltration
                     'connection_interfaces_exfiltration'   : 1 ,

                    # -- Audio/Video eavesdropping
                     'audio_video_eavesdropping'            : 1 ,
                    
                    # -- Suspicious connection establishment
                     'suspicious_connection_establishment'  : 1 ,

                    # -- PIM data leakage
                     'PIM_data_leakage'                     : 1 ,
                    
                    # -- Native code execution
                     'code_execution'                       : 1 ,
                    
                    # APK 
                     'apk_file_name'                        : 1 ,
                     'SHA-1_hash'                           : 1 ,
                     'file_list'                            : 2 ,
                     'certificate_information'              : 2 ,
                    
                    
                    # Manifest
                     'main_activity'                        : 3 ,
                     'activities'                           : 3 ,
                     'services'                             : 3 ,
                     'receivers'                            : 3 ,
                     'providers'                            : 3 ,
                     'permissions'                          : 1 ,
                     'features'                             : 2 ,
                     'libraries'                            : 2 ,
                    
                    
                    # APIs
                     'classes_list'                         : 3 ,
                     'internal_classes_list'                : 3 ,
                     'external_classes_list'                : 3 ,
                     'internal_packages_list'               : 3 ,
                     'external_packages_list'               : 3 
    }

Simply add the dictionary entry 'telephony_services_abuse': 1 in the # Malicious Behaviours Detection zone if you want your category's results printed for any verbosity >= 1.
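
How the report code consumes these weights is not shown on this page. One plausible reading, sketched here under that assumption, is that an item is printed when its weight is less than or equal to the requested verbosity. The filter_by_verbosity function and the sample items are hypothetical.

```python
# Reduced copy of the structure above, with the new category added.
data_level = {
    'telephony_services_abuse': 1,
    'file_list': 2,
    'classes_list': 3,
}


def filter_by_verbosity(items, verbosity, levels=data_level):
    # Assumption: an item is shown when its weight <= requested verbosity.
    # Items without a weight are dropped in this sketch.
    return [(name, value) for name, value in items
            if name in levels and levels[name] <= verbosity]


items = [
    ('telephony_services_abuse', ['This application makes phone calls']),
    ('file_list', ['classes.dex']),
    ('classes_list', ['Lcom/test/Main;']),
]

essential = [name for name, _ in filter_by_verbosity(items, 1)]
advanced = [name for name, _ in filter_by_verbosity(items, 2)]
```

With a weight of 1 the new category appears at every level, while raising it to 2 or 3 would reserve it for the Advanced or Expert reports.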

That's all! If you have not missed anything, your new criterion should now be included in the solution and yield results in the report.
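
One easy step to miss is the weight in data_level. The sketch below is a hypothetical consistency check, not an Androwarn feature: it lists every second-level key of the dataset that has no weight. It assumes the dataset has the list-of-dictionaries shape returned by perform_analysis().

```python
def categories_without_level(data, data_level):
    # Collect every second-level key that has no weight in data_level.
    missing = []
    for entry in data:
        for _section, items in entry.items():
            for name, _value in items:
                if name not in data_level and name not in missing:
                    missing.append(name)
    return missing


# Reduced sample: the new category was added to the dataset,
# but its weight was forgotten.
data = [
    {'analysis_results': [
        ('telephony_identifiers_leakage', []),
        ('telephony_services_abuse', []),
    ]},
]
data_level = {'telephony_identifiers_leakage': 1}

missing = categories_without_level(data, data_level)
```

An empty result means every item in the dataset has a weight.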