Multi Objects Serialization and Deserialization using Python with SDN
Posted On April 13, 2016 by Sneha Latha filed under
The move of enterprise and service provider applications towards cloud and virtual computing environments has had a major impact on the underlying networks. Today, numerous applications and resources are made available to end users on any type of device, with data integrated from other applications. As a consequence, traffic volumes are high and traffic patterns in the network keep changing. To address these problems and to support the elastic nature of cloud offerings, network virtualization plays a vital role. It requires network elements to be configured automatically, which in turn demands intensive communication.
Brief explanation of Software Defined Network (SDN) or network virtualization
Network virtualization separates the control plane and data plane activities of routers and switches, and a logically centralized software program controls the behavior of the entire network. Routing calculations are done by the controller, and the flow table is pushed to the routers or switches for data forwarding. This demands codifying the entire network functionality, involving many network functions such as security, prioritization and resource control, in terms of low-level device configuration, and configuring the physical devices automatically. This in turn requires a good serialization and deserialization technique to maintain the integrity of flow tables and configuration files, and the right sockets to be created.
In this article, we discuss serialization and deserialization techniques for Python objects, together with socket creation to enable communication between the controller (server) and the physical devices (clients). Python modules such as yaml, json, marshal, pickle and camel can be used for wrapping and unwrapping the data exchanged between the client and the server. The reader is expected to have basic knowledge of Python file handling.
2. Importance/Intricacies of Serialization
Serialization is the process of translating the state of an object or data structure into a format that can be stored in a file, buffered in memory or transmitted across a network. Deserialization must then create an identical clone of the original object. Together, serialization and deserialization convert objects into a sequence of binary data and vice versa. Marshalled/unmarshalled, wrapped/unwrapped and pickled/unpickled are different terms used for serialization/deserialization.
The purpose of serialization can be illustrated with a video game application, where the state of the game has to be resumed from where it was left off. Other situations include:
a) a web service in which the server sends large XML documents to the client and vice versa,
b) any other formatted documents exchanged between the server and the client, or
c) a configuration script to be run remotely on any computing, storage or networking device.
In all of the above situations, objects must be serialized at the transmitting end, and the same data must be deserialized with its integrity intact at the receiving end.
As a data persistence mechanism, serialization allows the sequences of binary data, and with them the program state, to be stored as binary files. In a distributed system, these binary files containing the object sequence data can be sent across the network through a TCP connection.
3. Implementation of Multi Objects Serialization
To obtain the binary files containing the object sequence, we use Python modules; the choice of module depends on several parameters. These include the data type to be transmitted, whether the files must be human readable, whether they must be editable, the number of messages to be transmitted per second, the time required for serialization and the size of the serialized output file.
Many object types can be serialized, such as strings, lists, dictionaries, files and many other complex data types. We use switcher code to let the user choose the required module from the given list. The switcher implementation is shown in Figure 1. The code snippet offers a choice among the pickle, json, yaml and marshal modules, depending on whether the data_type key is 1, 2, 3 or 4. For example, if meth_3 is chosen, the yaml module is used to serialize the object.
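Two of these parameters, serialization time and serialized output size, are easy to measure directly. The following is a minimal sketch using only standard-library modules; the sample dictionary and the measure helper are made up for illustration, and the numbers it prints depend on the machine and the data.

```python
import json
import marshal
import pickle
import timeit

# Hypothetical sample data standing in for a small flow table.
sample = {
    "flows": [
        {"id": i, "match": "10.0.0.%d" % i, "priority": i % 8}
        for i in range(100)
    ]
}


def measure(name, dumps, repeat=200):
    """Return the encoded size and the time taken for `repeat` serializations."""
    encoded = dumps(sample)
    seconds = timeit.timeit(lambda: dumps(sample), number=repeat)
    return {"module": name, "bytes": len(encoded), "seconds": seconds}


results = [
    measure("pickle", pickle.dumps),
    # json.dumps returns text, so encode it to compare sizes in bytes.
    measure("json", lambda obj: json.dumps(obj).encode("utf-8")),
    measure("marshal", marshal.dumps),
]

for row in results:
    print("%-8s %6d bytes  %.4f s" % (row["module"], row["bytes"], row["seconds"]))
```

The same helper could be pointed at a yaml serializer if PyYAML is installed; it is left out here so that the sketch runs with the standard library alone.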
Figure 1. Python script for choosing modules
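A switcher of this kind is commonly written in Python as a dictionary that maps keys to functions. The sketch below is one possible shape, not the original script: only meth_3 is named in the text, so meth_1, meth_2, meth_4 and choose_module are assumed names, and the yaml module is assumed to be the third-party PyYAML package.

```python
import json
import marshal
import pickle


def meth_1():
    return pickle


def meth_2():
    return json


def meth_3():
    # PyYAML is a third-party package, so it is imported only when chosen.
    import yaml
    return yaml


def meth_4():
    return marshal


# Keys follow the order given in the text: pickle, json, yaml, marshal.
SWITCHER = {1: meth_1, 2: meth_2, 3: meth_3, 4: meth_4}


def choose_module(data_type):
    """Return the serialization module selected by the data_type key."""
    try:
        method = SWITCHER[data_type]
    except KeyError:
        raise ValueError("data_type must be 1, 2, 3 or 4") from None
    return method()
```

For example, choose_module(2).dumps({"port": 1}) serializes the dictionary with json.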
Once a specific module is chosen for transmitting the data, Python's socket module is used to provide the connection between the client and the server. The socket module provides methods to establish communication between client and server over TCP. Sample code for the socket server implementation is shown in Figure 2. The server is opened on port 50020 (any ephemeral port number can be used) to bind and listen for various clients. The buffer size for receiving data is set to 5000 bytes. If a remote server is used, the IP address of that server has to be provided instead of localhost in line number 7.
The yaml, marshal, json and pickle modules listed in Figure 1 provide a dump or dumps method for serialization and a load or loads method for deserialization. These methods are explained below:
1. dump(object, file) method writes the serialized representation of the object to an open file object.
2. dumps(object) method returns the serialized representation of the object as a string (a bytes object in the case of pickle and marshal).
3. load(file) method reads the serialized data from the open file, parses it as a data stream, and reconstructs and returns the original object hierarchy.
4. loads(string) method reads and reconstructs the object hierarchy from a string.
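The server described above can be sketched as follows. This is an illustration under stated assumptions rather than the original script: the helper names are hypothetical, the server is assumed to echo the received data back to the client, and a single recv call is assumed to be enough because the message fits in the 5000-byte buffer.

```python
import socket
import threading

HOST = "localhost"   # replace with the server's IP address for a remote server
PORT = 50020         # port number from the text
BUFFER_SIZE = 5000   # receive buffer size from the text


def open_server(host=HOST, port=PORT):
    """Create a TCP socket, bind it to (host, port) and start listening."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(5)
    return server


def echo_once(server):
    """Accept one client, read one message and send it back unchanged."""
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(BUFFER_SIZE)
        conn.sendall(data)
    return data


def demo(payload=b"hostname: r1\n"):
    """Run the server in a background thread and talk to it from a client."""
    server = open_server(port=0)  # port 0 lets the OS pick a free port
    host, port = server.getsockname()
    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()
    with socket.create_connection((host, port)) as client:
        client.sendall(payload)
        reply = client.recv(BUFFER_SIZE)
    worker.join()
    server.close()
    return payload, reply
```

Calling demo() returns the bytes sent and the bytes received, which should be identical. A long-running server would call echo_once in a loop on the socket returned by open_server().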
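The four methods can be tried out with the standard-library json and pickle modules. In this sketch the config dictionary is a made-up example, and in-memory file objects from the io module stand in for real files on disk.

```python
import io
import json
import pickle

# Hypothetical configuration object used only for illustration.
config = {"hostname": "switch-1", "vlans": [10, 20], "enabled": True}

# dumps/loads work on in-memory values.
json_text = json.dumps(config)        # json returns a str
pickle_bytes = pickle.dumps(config)   # pickle returns bytes
from_text = json.loads(json_text)
from_bytes = pickle.loads(pickle_bytes)

# dump/load work on open file objects: a text file for json ...
text_file = io.StringIO()
json.dump(config, text_file)          # writes to the file, returns None
text_file.seek(0)                     # rewind before reading back
restored_json = json.load(text_file)

# ... and a binary file for pickle.
binary_file = io.BytesIO()
pickle.dump(config, binary_file)
binary_file.seek(0)
restored_pickle = pickle.load(binary_file)
```

With real files, json needs a file opened in text mode and pickle one opened in binary mode, for example open(path, "wb") for writing a pickle.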
Based on the data_type, if meth_3 is chosen, the corresponding yaml client code used for serialization and deserialization is as shown in Figure 3.
If the physical devices (clients) are to be configured from the controller, configuration files are the objects to be serialized and deserialized. After deserialization, the configuration files have to be in a human-readable and editable format to allow further changes to be made. In such scenarios, the yaml module is a better choice. It supports serializing and deserializing simple or complex data sets, including the timestamp type.
Figure 3. Python script for client socket with yaml
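A client along these lines might look like the sketch below. It is not the original script: the function names are hypothetical, the yaml module is assumed to be the third-party PyYAML package (imported only when yaml_codec is called), and the server is assumed to send the same data back. The serializer is passed in as a pair of functions so that the same client logic works with any of the modules.

```python
import socket

BUFFER_SIZE = 5000  # receive buffer size from the text


def yaml_codec():
    """Return (dumps, loads) functions backed by PyYAML (assumed installed)."""
    import yaml

    def dumps(obj):
        return yaml.safe_dump(obj).encode("utf-8")

    def loads(data):
        return yaml.safe_load(data.decode("utf-8"))

    return dumps, loads


def exchange(sock, obj, dumps, loads):
    """Serialize obj, send it on a connected socket and deserialize the reply."""
    sock.sendall(dumps(obj))
    # A single recv assumes the whole reply fits in BUFFER_SIZE bytes.
    reply = sock.recv(BUFFER_SIZE)
    return loads(reply)


def run_client(obj, host="localhost", port=50020):
    """Connect to the server, send obj as YAML and return the object received."""
    dumps, loads = yaml_codec()
    with socket.create_connection((host, port)) as sock:
        return exchange(sock, obj, dumps, loads)
```

With a server listening on port 50020, run_client({"hostname": "r1", "vlans": [10, 20]}) would return the dictionary reconstructed from the server's reply.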
4. Tracking Python Script Execution
To start with, the server is up and running, waiting for client requests. Depending on the data_type to be serialized, the switcher code chooses one of the methods. In this implementation we have chosen a dictionary with a complex data type, so the yaml client is executed.
The results tracked for the communication from the yaml client to the server are shown in Figure 4. The first window in the figure shows the server up and running. The second window shows the messages communicated between client and server as transmitted and received data. The transmitted and received data remain exactly the same, which means that the original data is recovered after deserialization.
Figure 4. Result of transmitted and received objects through yaml client
In summary, we have discussed different modules used for object serialization and deserialization in Python. We have also discussed the criteria for choosing the right module for transmitting configuration files from the controller to a physical device, such as a router, for network virtualization. From the SDN perspective, the yaml module can be used to communicate flow tables from the controller to the physical devices and to configure them automatically.
About the Authors
Sheela Ganesh works for Talent Transformation, Wipro Technologies, Bangalore. She has more than 13 years of experience in the IT field. Her core expertise is in telecommunication and networking. Sheela can be reached at email@example.com.
Dr B. Thangaraju is an open source software evangelist, who works at Talent Transformation, Wipro Technologies, Bangalore.