Multi-Object Serialization and Deserialization Using Python with SDN

1. Introduction

The migration of enterprise and service provider applications to cloud and virtual computing environments has had a major impact on the underlying networks. Today, numerous applications and resources are made available to end users on any type of device, with data integrated from other applications. As a consequence, traffic volumes are high and traffic patterns in the network change. To address these problems and to support the elastic nature of cloud offerings, network virtualization plays a vital role. It requires network elements to be configured automatically, which in turn demands intensive communication.

A brief explanation of Software-Defined Networking (SDN), or network virtualization

Network virtualization separates the control plane and data plane activities of routers and switches, and a logically centralized software program controls the behavior of the entire network. The controller performs the routing calculations and pushes the flow tables to the routers or switches for data forwarding. This demands codifying the entire network functionality, which involves many network functions such as security, prioritization and resource control, as low-level device configuration, and configuring the physical devices automatically. It therefore requires a good serialization and deserialization technique to maintain the integrity of flow tables and configuration files, and the right sockets to be created.

In this article, we discuss serialization and deserialization techniques for Python objects, together with socket creation to enable communication between the controller (server) and the physical devices (clients). Python modules such as yaml, json, marshal, pickle and camel can be used for wrapping and unwrapping the data exchanged between the client and the server. The reader is expected to have basic knowledge of Python file handling.

2. Importance/Intricacies of Serialization

Serialization is the process of translating the state of an object or data structure into a format that can be stored in a file, buffered in memory or transmitted across a network. Deserialization must then create an identical clone of the original object. Serialization and deserialization convert objects into a sequence of binary data and vice versa. Marshalled/unmarshalled, wrapped/unwrapped and pickled/unpickled are different terms used for serialization/deserialization.

Let us illustrate the purpose of serialization with the example of a video game application, where the state of the game has to be resumed from where it was left off. Other situations include:
a) a web service in which the server provides huge XML documents or
b) other formatted documents to the client and vice versa, or
c) a configuration script to be run remotely on a computing, storage or networking device.
In all of the above situations, the object must be serialized at the transmitting end, and the same data must be deserialized with its integrity intact at the receiving end.

As a means of data persistence, serialization allows the sequences of binary data, and hence the program state, to be stored in binary files. In the context of communication across the network in a distributed system, these binary files, which contain the object sequence data, can be sent over a TCP connection.
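The persistence idea can be sketched with the video game example mentioned above. This is only an illustration: the game_state dictionary, the save_state and load_state helpers and the file name are hypothetical, and pickle is used simply because it ships with Python.

```python
import os
import pickle
import tempfile

# Hypothetical game state, used only for illustration
game_state = {"level": 3, "score": 1250, "inventory": ["key", "map"]}


def save_state(state, path):
    """Serialize the state into a binary file."""
    with open(path, "wb") as handle:
        pickle.dump(state, handle)


def load_state(path):
    """Deserialize the state from a binary file."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# A temporary directory keeps the example from touching real files
path = os.path.join(tempfile.mkdtemp(), "savegame.bin")
save_state(game_state, path)
restored = load_state(path)

print(restored == game_state)  # the content is the same
print(restored is game_state)  # but it is a separate object, i.e. a clone
```

The restored object compares equal to the original yet is a distinct object in memory, which is what "an identical clone" means in practice.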

3. Implementation of Multi Objects Serialization

To obtain the binary files containing the object sequence, we use Python modules; the choice of module depends on several parameters. These include the data type to be transmitted, whether the files must be human readable, whether the files must be editable, the number of messages to be transmitted per second, the time required for serialization and the size of the serialized output.
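Two of these parameters, serialization time and output size, are easy to measure. The following is a rough sketch rather than a benchmark: the sample flow-table-like dictionary and the measure helper are made up for illustration, only the standard library modules are compared, and the numbers will vary from machine to machine.

```python
import json
import marshal
import pickle
import timeit

# Hypothetical sample data loosely resembling a small flow table
sample = {
    "flows": [
        {"id": i, "match": "10.0.0.%d" % i, "priority": i % 8}
        for i in range(100)
    ]
}


def measure(module, obj, repeats=200):
    """Return (size in bytes, average seconds per dumps call)."""
    encoded = module.dumps(obj)
    if isinstance(encoded, str):  # json returns text, the others bytes
        encoded = encoded.encode("utf-8")
    seconds = timeit.timeit(lambda: module.dumps(obj), number=repeats)
    return len(encoded), seconds / repeats


results = {m.__name__: measure(m, sample) for m in (pickle, json, marshal)}
for name, (size, secs) in results.items():
    print(f"{name:8s} {size:6d} bytes  {secs * 1e6:8.1f} us per dumps()")
```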

Many object types can be serialized, such as strings, lists, dictionaries, files and many other complex data types. We use switcher code to let the user choose the required module from a given list. The switcher implementation is shown in Figure 1. The code offers a choice among the pickle, json, yaml and marshal modules, depending on whether the data_type key is 1, 2, 3 or 4. For example, if meth_3 is chosen, the yaml module is used to serialize the object.

Figure 1. Python script for choosing modules
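As an illustration of the idea (not the authors' listing), a switcher can be sketched as a dictionary lookup. The key order 1 to 4 follows the module order given in the text, with key 3 corresponding to meth_3 and yaml; the names SWITCHER and choose_module are assumptions. The module is imported only when selected, so the third-party PyYAML package is needed only if key 3 is chosen.

```python
import importlib

# Assumed mapping from the data_type key to a module name;
# key 3 corresponds to meth_3 (yaml) in the text
SWITCHER = {
    1: "pickle",
    2: "json",
    3: "yaml",     # third-party PyYAML package
    4: "marshal",
}


def choose_module(data_type):
    """Return the serialization module selected by data_type."""
    try:
        module_name = SWITCHER[data_type]
    except KeyError:
        raise ValueError("data_type must be one of %s" % sorted(SWITCHER))
    # Import lazily so that only the chosen module has to be installed
    return importlib.import_module(module_name)


serializer = choose_module(2)
print(serializer.__name__)
print(serializer.dumps({"device": "router-1"}))
```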

Once a specific module is chosen for transmitting the data, Python's socket module is used to provide the connection between the client and the server. The socket module provides methods to establish communication between client and server over TCP. A sample socket server implementation is shown in Figure 2. The server binds to port 50020 (any ephemeral port number can be used) and listens for connections from clients. The buffer size for receiving data is set to 5000 bytes. If a remote server is used, the IP address of that server has to be provided instead of localhost in line 7 of the listing.
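A minimal server along these lines might look as follows. The port, buffer size and host come from the description above; the function names and the choice to echo each message back to the client are assumptions made for this sketch, and the line numbering does not match the original listing.

```python
import socket

HOST = "localhost"   # use the server's IP address for a remote server
PORT = 50020         # port number quoted in the text
BUFFER_SIZE = 5000   # receive buffer size quoted in the text


def open_server(host=HOST, port=PORT):
    """Create a TCP socket, bind it and start listening."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(5)
    return server


def serve_one_client(server, buffer_size=BUFFER_SIZE):
    """Accept one client, read one message and echo it back unchanged."""
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(buffer_size)
        conn.sendall(data)  # echoing is an assumption of this sketch
    return data
```

To run it as a standalone server, call open_server() once and then call serve_one_client(server) in a loop. A single recv call is enough only while each message fits in the buffer; larger payloads would need message framing.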

The yaml, marshal, json and pickle modules mentioned in Figure 1 provide dump or dumps methods for serialization and load or loads methods for deserialization. In the yaml module (PyYAML), dump and load fill both roles: dump returns a string when no file is given, and load accepts a string as well as a file. These methods are explained below:

1. dump(object, file) writes the serialized representation of the object to an open file object.
2. dumps(object) returns the serialized representation of the object as a string (a bytes object in the case of pickle and marshal).
3. load(file) reads the serialized data from an open file, parses it as a data stream, and reconstructs and returns the original object hierarchy as a clone.
4. loads(string) reads and reconstructs the object hierarchy from a string.
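The four methods can be tried out with the standard library modules. In this short sketch the record dictionary is a made-up example, and an in-memory io.BytesIO buffer stands in for an open binary file.

```python
import io
import json
import marshal
import pickle

# Hypothetical object to serialize
record = {"device": "router-1", "ports": [1, 2, 3], "enabled": True}

# dumps/loads: serialize to an in-memory string or bytes object and back
round_trips = {}
for module in (pickle, json, marshal):
    encoded = module.dumps(record)
    decoded = module.loads(encoded)
    round_trips[module.__name__] = (type(encoded).__name__, decoded)
    print(module.__name__, type(encoded).__name__, decoded == record)

# dump/load: write to and read from an open file object
buffer = io.BytesIO()         # stands in for open(path, "wb")
pickle.dump(record, buffer)
buffer.seek(0)                # rewind before reading back
restored = pickle.load(buffer)
print(restored == record)
```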

Based on the data_type, if meth_3 is chosen, the corresponding YAML client code used for serialization and deserialization is shown in Figure 3.

If the physical devices (clients) are to be configured from the controller, the configuration files are the objects to be serialized and deserialized. After deserialization, the configuration files have to be in a human-readable and editable format so that further changes can be made. In such scenarios, the yaml module is a better choice. It supports serializing and deserializing simple or complex data sets, including the timestamp type.
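To see why YAML suits configuration data, consider the sketch below. It assumes the yaml module is the third-party PyYAML package; the config dictionary is a hypothetical device configuration, and safe_dump and safe_load are used here as the restricted counterparts of dump and load.

```python
import datetime

import yaml  # third-party PyYAML package (assumed)

# Hypothetical device configuration, for illustration only
config = {
    "hostname": "router-1",
    "interfaces": [{"name": "eth0", "address": "10.0.0.1", "mtu": 1500}],
    "updated": datetime.datetime(2024, 1, 1, 12, 0, 0),
}

# Serialize to plain text that can be read and edited by hand
text = yaml.safe_dump(config, default_flow_style=False)
print(text)

# Deserialize; the timestamp comes back as a datetime object
restored = yaml.safe_load(text)
print(type(restored["updated"]).__name__)
print(restored == config)
```

Because the serialized form is plain text, an operator can change a value such as the MTU in an editor and the file still loads back into the same structure.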

Figure 3. Python script for client socket with yaml
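A client of this kind might be sketched as follows. This is not the authors' listing: the function name send_yaml is hypothetical, PyYAML is assumed for the yaml module, and the server is assumed to echo back what it receives so that the reply can be deserialized and compared with the original.

```python
import socket

import yaml  # third-party PyYAML package (assumed)

BUFFER_SIZE = 5000  # matches the server buffer size given in the text


def send_yaml(obj, host="localhost", port=50020, buffer_size=BUFFER_SIZE):
    """Serialize obj to YAML, send it and return the deserialized reply."""
    payload = yaml.safe_dump(obj).encode("utf-8")  # object -> bytes
    with socket.create_connection((host, port)) as client:
        client.sendall(payload)
        reply = client.recv(buffer_size)  # assumes the reply fits the buffer
    return yaml.safe_load(reply.decode("utf-8"))   # bytes -> object
```

With a server listening on port 50020, a call such as send_yaml({"hostname": "router-1"}) would return a dictionary equal to the one that was sent, provided the server echoes the data.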

4. Tracking Python Script Execution

To start with, the server is up and running, waiting for client requests. Depending on the data_type to be serialized, the switcher code chooses one of the methods. In this implementation we have chosen a dictionary with complex data types, so the yaml client is executed.

The results tracked for the communication from the yaml client to the server are shown in Figure 4. The first window in the figure shows the server up and running. The second window shows the messages communicated between the client and the server as transmitted and received data. The transmitted and received data are exactly the same, which means that the original data is recovered after deserialization.

Figure 4. Result of transmitted and received objects through the yaml client

In summary, we have discussed different modules used for object serialization and deserialization in Python. We have also discussed the criteria for choosing the right module to transmit configuration files from the controller to a physical device, such as a router, for network virtualization. From the SDN perspective, the yaml module can be used to communicate flow tables from the controller to the physical devices and to configure them automatically.

About the Authors

Sheela Ganesh works for Talent Transformation, Wipro Technologies, Bangalore. She has more than 13 years of experience in the IT field. Her core expertise is in telecommunication and networking. Sheela can be reached at sheela.ganesh@wipro.com.

Dr B. Thangaraju is an open source software evangelist, who works at Talent Transformation, Wipro Technologies, Bangalore.







