Building a Module in Python

This article shows you, step by step, how to create a module in Python.

Modules provide a convenient way to share Python code between applications. A module is a very simple construct: in Python, it is merely a file of Python statements. A module might define functions and classes, and it can contain simple executable code that’s not inside a function or class. Best of all, a module might contain documentation about how to use the code in the module.

Python comes with a library of hundreds of modules that you can call in your scripts. You can also create your own modules to share code among your scripts.

The first step is to examine what modules really are and how they work.

Exploring Modules

A module is just a Python source file. The module can contain variables, classes, functions, and any other element available in your Python scripts.

You can get a better understanding of modules by using the dir function. Pass it the name of a Python element, such as a module, and dir lists all of the attributes of that element. For example, to see the attributes of __builtins__, which contains the built-in functions, classes, and variables, enter the following in the interactive interpreter:

>>> dir(__builtins__)
['ArithmeticError', 'AssertionError', 'AttributeError', 'DeprecationWarning',
'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False',
'FloatingPointError', 'FutureWarning', 'IOError', 'ImportError',
'IndentationError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
'LookupError', 'MemoryError', 'NameError', 'None', 'NotImplemented',
'NotImplementedError', 'OSError', 'OverflowError', 'OverflowWarning',
'PendingDeprecationWarning', 'ReferenceError', 'RuntimeError', 'RuntimeWarning',
'StandardError', 'StopIteration', 'SyntaxError', 'SyntaxWarning',
'SystemError', 'SystemExit', 'TabError', 'True', 'TypeError',
'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError',
'UnicodeError', 'UnicodeTranslateError', 'UserWarning', 'ValueError',
'Warning', 'ZeroDivisionError', '__debug__', '__doc__', '__import__',
'__name__', 'abs', 'apply', 'basestring', 'bool', 'buffer', 'callable', 'chr',
'classmethod', 'cmp', 'coerce', 'compile', 'complex', 'copyright', 'credits',
'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval', 'execfile', 'exit',
'file', 'filter', 'float', 'getattr', 'globals', 'hasattr', 'hash', 'help',
'hex', 'id', 'input', 'int', 'intern', 'isinstance', 'issubclass', 'iter',
'len', 'license', 'list', 'locals', 'long', 'map', 'max', 'min', 'object',
'oct', 'open', 'ord', 'pow', 'property', 'quit', 'range', 'raw_input',
'reduce', 'reload', 'repr', 'round', 'setattr', 'slice', 'staticmethod',
'str', 'sum', 'super', 'tuple', 'type', 'unichr', 'unicode', 'vars',
'xrange', 'zip']

The example shown here uses Python 2.3, but the techniques apply to Python 2.4 as well.

For a language with as many features as Python, there are surprisingly few built-in elements. You can run the dir function on modules you import as well. For example:

>>> import sys
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__name__',
'__stderr__', '__stdin__', '__stdout__', '_getframe', 'api_version',
'argv', 'builtin_module_names', 'byteorder', 'call_tracing', 'callstats',
'copyright', 'displayhook', 'exc_clear', 'exc_info', 'exc_type', 'excepthook',
'exec_prefix', 'executable', 'exit', 'getcheckinterval', 'getdefaultencoding',
'getdlopenflags', 'getfilesystemencoding', 'getrecursionlimit', 'getrefcount',
'hexversion', 'last_traceback', 'last_type', 'last_value', 'maxint',
'maxunicode', 'meta_path', 'modules', 'path', 'path_hooks',
'path_importer_cache', 'platform', 'prefix', 'ps1', 'ps2', 'setcheckinterval',
'setdlopenflags', 'setprofile', 'setrecursionlimit', 'settrace', 'stderr',
'stdin', 'stdout', 'version', 'version_info', 'warnoptions']

Use dir to help examine modules, including the modules you create.
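As a small sketch of the same idea in script form, the following filters the output of dir down to the names a module exposes publicly. It assumes Python 3 (where print is a function) rather than the Python 2.3 used in the transcripts above, it uses the standard math module purely for illustration, and the helper name public_names is hypothetical, not part of Python.

```python
import math


def public_names(module):
    """Return the attribute names of module that do not start with an underscore."""
    return [name for name in dir(module) if not name.startswith("_")]


# dir() returns a sorted list of strings, so ordinary list operations apply.
names = public_names(math)
print(len(names), "public names in math, for example:", names[:5])
```

Filtering on the leading underscore is only a convention, but it is a quick way to separate the names meant for callers from a module's housekeeping attributes such as __name__ and __doc__.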

Importing Modules

Before using a module, you need to import it. The standard syntax for importing follows:

import module

You can use this syntax with modules that come with Python or with modules you create. You can also use the following alternative syntax:

from module import item

The alternative syntax enables you to specifically import just a class or function if that is all you need.
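The two import forms can be compared side by side in a short sketch. It assumes Python 3 and uses the standard math module only as an example; the variable names area and radius are made up for illustration.

```python
# Form 1: import the whole module and qualify each name with the module name.
import math

area = math.pi * 2 ** 2

# Form 2: import only the names you need and use them without the prefix.
from math import pi, sqrt

radius = sqrt(area / pi)

print(area, radius)
```

The first form keeps it obvious where each name comes from; the second is shorter to type but places pi and sqrt directly in your own namespace, where they can collide with names you define yourself.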

If a module has changed, you can reload the new definition of the module using the reload function. The syntax follows:

reload(module)

Replace module with the module you want to reload.

Because reload is a function, always call it with parentheses. Because import is a statement, do not use parentheses with it.
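The following is a minimal sketch of reloading a module after its source changes on disk. It assumes Python 3, where the built-in reload described above is no longer a built-in and importlib.reload is used instead. The module name reload_demo_module and its GREETING variable are hypothetical; the script writes the module to a temporary directory so that it is self-contained.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "reload_demo_module"  # hypothetical name, used only for this sketch

with tempfile.TemporaryDirectory() as tmp_dir:
    module_path = os.path.join(tmp_dir, MODULE_NAME + ".py")
    with open(module_path, "w") as source:
        source.write("GREETING = 'first version'\n")

    sys.path.insert(0, tmp_dir)  # make the temporary directory importable
    try:
        importlib.invalidate_caches()
        demo = importlib.import_module(MODULE_NAME)
        first = demo.GREETING

        # Edit the module on disk; the already imported object is unchanged
        # until it is reloaded explicitly.
        with open(module_path, "w") as source:
            source.write("GREETING = 'second version, edited on disk'\n")

        demo = importlib.reload(demo)
        second = demo.GREETING
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop(MODULE_NAME, None)

print(first)
print(second)
```

Note that names imported earlier with the from module import item form are not updated by a reload; only the module object itself is refreshed.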

Finding Modules

To import a module, the Python interpreter must first find it. The interpreter first looks for a file named module.py, where module is the name of the module you pass to the import statement. On finding the file, the interpreter compiles the module into a .pyc file. The next time you import the module, the interpreter can load the precompiled version, which speeds up the start of your Python scripts.
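To see the compilation step in isolation, the sketch below byte-compiles a small module by hand with the standard py_compile module. It assumes Python 3, which by default stores .pyc files in a __pycache__ subdirectory rather than next to the source file as Python 2.3 did. The file name pyc_demo_module.py is hypothetical.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    source_path = os.path.join(tmp_dir, "pyc_demo_module.py")
    with open(source_path, "w") as source:
        source.write("ANSWER = 6 * 7\n")

    # py_compile.compile returns the path of the byte-compiled file it wrote;
    # doraise=True turns compilation problems into exceptions.
    compiled_path = py_compile.compile(source_path, doraise=True)
    compiled_exists = os.path.exists(compiled_path)
    compiled_name = os.path.basename(compiled_path)

print(compiled_name, compiled_exists)
```

You rarely need to do this yourself, because the interpreter performs the same step automatically the first time a module is imported.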

When you place an import statement in your scripts, the Python interpreter has to be able to find the module. The key point is that the Python interpreter only looks in a certain number of directories for your module. If you enter a name the Python interpreter cannot find, it will display an error, as shown in the following example:

>>> import foo
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: No module named foo

The Python interpreter looks in the directories that make up the module search path. These directories are listed in the sys.path variable of the sys module. To see where the interpreter looks for modules, print the value of sys.path in the Python interpreter. For example:

>>> import sys
>>> print sys.path
['', '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python23.zip',
'/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3',
'/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/plat-darwin',
'/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/plat-mac',
'/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/plat-mac/lib-scriptpackages',
'/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/lib-tk',
'/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/lib-dynload',
'/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/site-packages']

Note that one of the directory entries is empty, signifying the current directory.
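The effect of the search path can be sketched in a few lines: a module that lives in a directory not listed in sys.path cannot be imported until that directory is added. The sketch assumes Python 3; the module name path_demo_module and its MESSAGE variable are hypothetical, and a temporary directory stands in for wherever you keep your own modules.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "path_demo_module"  # hypothetical name, used only for this sketch

# Before its directory is on sys.path, the interpreter cannot find the module.
try:
    importlib.import_module(MODULE_NAME)
    found_before = True
except ImportError:
    found_before = False

with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, MODULE_NAME + ".py"), "w") as source:
        source.write("MESSAGE = 'found via sys.path'\n")

    sys.path.insert(0, tmp_dir)  # add the directory to the search path
    try:
        importlib.invalidate_caches()
        demo = importlib.import_module(MODULE_NAME)
        message = demo.MESSAGE
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop(MODULE_NAME, None)

print(found_before, message)
```

Changing sys.path inside a script affects only that running interpreter, which makes it handy for experiments like this one.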

Digging through Modules

Because Python is an open-source package, you can get the source code for the Python interpreter as well as for all of its modules. In fact, even with a binary distribution of Python, you’ll find the source code for modules written in Python. Start by looking in all the directories listed in the sys.path variable for files with names ending in .py. These are Python modules. Some modules contain functions, and others contain classes and functions. For example, the following module, MimeWriter, defines a class in the Python 2.3 distribution:

"""Generic MIME writer.

This module defines the class MimeWriter. The MimeWriter class implements a basic formatter for creating MIME multi-part files. It doesn't seek around the output file nor does it use large amounts of buffer space. You must write the parts out in the order that they should occur in the final file. MimeWriter does buffer the headers you add, allowing you to rearrange their order.

"""

import mimetools

__all__ = ["MimeWriter"]

class MimeWriter:

    """Generic MIME writer.

    Methods:

    __init__()
    addheader()
    flushheaders()
    startbody()
    startmultipartbody()
    nextpart()
    lastpart()

    A MIME writer is much more primitive than a MIME parser. It doesn't seek around on the output file, and it doesn't use large amounts of buffer space, so you have to write the parts in the order they should occur on the output file. It does buffer the headers you add, allowing you to rearrange their order.

    General usage is:

    f = <open the output file>
    w = MimeWriter(f)
    ...call w.addheader(key, value) 0 or more times...

    followed by either:

    f = w.startbody(content_type)
    ...call f.write(data) for body data...

    or:

    w.startmultipartbody(subtype)
    for each part:
        subwriter = w.nextpart()
        ...use the subwriter's methods to create the subpart...
    w.lastpart()

    The subwriter is another MimeWriter instance, and should be treated in the same way as the toplevel MimeWriter. This way, writing recursive body parts is easy.

    Warning: don't forget to call lastpart()!

    XXX There should be more state so calls made in the wrong order are detected.

    Some special cases:

    - startbody() just returns the file passed to the constructor;
      but don't use this knowledge, as it may be changed.

    - startmultipartbody() actually returns a file as well; this can be used to write the initial 'if you can read this your mailer is not MIME-aware' message.

    - If you call flushheaders(), the headers accumulated so far are written out (and forgotten); this is useful if you don't need a body part at all, e.g. for a subpart of type message/rfc822 that's (mis)used to store some header-like information.

    - Passing a keyword argument 'prefix=<flag>' to addheader(), start*body() affects where the header is inserted; 0 means append at the end, 1 means insert at the start; default is append for addheader(), but insert for start*body(), which use it to determine where the Content-Type header goes.

    """

    def __init__(self, fp):
        self._fp = fp
        self._headers = []

    def addheader(self, key, value, prefix=0):
        """Add a header line to the MIME message.

        The key is the name of the header, where the value obviously provides the value of the header. The optional argument prefix determines where the header is inserted; 0 means append at the end, 1 means insert at the start. The default is to append.

        """
        lines = value.split("\n")
        while lines and not lines[-1]: del lines[-1]
        while lines and not lines[0]: del lines[0]
        for i in range(1, len(lines)):
            lines[i] = " " + lines[i].strip()
        value = "\n".join(lines) + "\n"
        line = key + ": " + value
        if prefix:
            self._headers.insert(0, line)
        else:
            self._headers.append(line)

    def flushheaders(self):
        """Writes out and forgets all headers accumulated so far.

        This is useful if you don't need a body part at all; for example, for a subpart of type message/rfc822 that's (mis)used to store some header-like information.

        """
        self._fp.writelines(self._headers)
        self._headers = []

    def startbody(self, ctype, plist=[], prefix=1):
        """Returns a file-like object for writing the body of the message.

        The content-type is set to the provided ctype, and the optional parameter, plist, provides additional parameters for the content-type declaration. The optional argument prefix determines where the header is inserted; 0 means append at the end, 1 means insert at the start. The default is to insert at the start.

        """
        for name, value in plist:
            ctype = ctype + ';\n %s=\"%s\"' % (name, value)
        self.addheader("Content-Type", ctype, prefix=prefix)
        self.flushheaders()
        self._fp.write("\n")
        return self._fp

    def startmultipartbody(self, subtype, boundary=None, plist=[], prefix=1):
        """Returns a file-like object for writing the body of the message.

        Additionally, this method initializes the multi-part code, where the subtype parameter provides the multipart subtype, the boundary parameter may provide a user-defined boundary specification, and the plist parameter provides optional parameters for the subtype. The optional argument, prefix, determines where the header is inserted; 0 means append at the end, 1 means insert at the start. The default is to insert at the start. Subparts should be created using the nextpart() method.

        """
        self._boundary = boundary or mimetools.choose_boundary()
        return self.startbody("multipart/" + subtype,
                              [("boundary", self._boundary)] + plist,
                              prefix=prefix)

    def nextpart(self):
        """Returns a new instance of MimeWriter which represents an
        individual part in a multipart message.

        This may be used to write the part as well as used for creating recursively complex multipart messages. The message must first be initialized with the startmultipartbody() method before using the nextpart() method.

        """
        self._fp.write("\n--" + self._boundary + "\n")
        return self.__class__(self._fp)

    def lastpart(self):
        """This is used to designate the last part of a multipart message. It should always be used when writing multipart messages.

        """
        self._fp.write("\n--" + self._boundary + "--\n")


if __name__ == '__main__':
    import test.test_MimeWriter

The majority of this small module is made up of documentation that instructs users how to use the module. Documentation is important.

When you look through the standard Python modules, you can get a feel for how modules are put together. It also helps when you want to create your own modules.
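You can also inspect a module's documentation and location from code instead of opening the file in an editor. The sketch below assumes Python 3 and uses the standard json module as its subject, since MimeWriter is no longer part of current Python releases; the variable names are made up for illustration.

```python
import inspect
import json

# inspect.getdoc returns the module docstring with its indentation cleaned up;
# the first line is conventionally a one-line summary.
summary = inspect.getdoc(json).splitlines()[0]

# __all__ lists the names the module's author considers public.
exported = list(json.__all__)

# __file__ points at the source file when the module is written in Python.
source_file = getattr(json, "__file__", None)

print(summary)
print(exported)
print(source_file)
```

The same three attributes, the docstring, __all__, and __file__, are a reasonable first stop when you explore any unfamiliar module.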

Creating Modules and Packages

Creating modules is easier than you might think. A module is merely a Python source file. In fact, any time you’ve created a Python file, you have already been creating modules without even knowing it. Use the following example to help you get started creating modules.

Creating a Module with Functions

Enter the following Python code and name the file food.py:

def favoriteFood():
    print 'The only food worth eating is an omelet.'

This is your module. You then can import the module using the Python interpreter. For example:

>>> import food
>>> dir(food)
['__builtins__', '__doc__', '__file__', '__name__', 'favoriteFood']

How It Works

Python uses a very simple definition for a module. You can use any Python source file as a module, as shown in this short example. The dir function lists the items defined in the module, including the function favoriteFood.

Once you have imported the module, you can execute the code in it with a command like the following:

>>> food.favoriteFood()
The only food worth eating is an omelet.
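The steps so far can be collected into one self-contained script. This is a sketch under a few assumptions: it targets Python 3, so favoriteFood returns its sentence instead of using the print statement; it adds docstrings that the original food.py does not have; and it writes food.py to a temporary directory rather than to your working directory.

```python
import importlib
import os
import sys
import tempfile

# Python 3 variant of the food module, with docstrings added for illustration.
FOOD_SOURCE = '''"""A tiny example module about food."""


def favoriteFood():
    """Return a sentence naming the favorite food."""
    return 'The only food worth eating is an omelet.'
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, "food.py"), "w") as source:
        source.write(FOOD_SOURCE)

    sys.path.insert(0, tmp_dir)  # let the interpreter find food.py
    try:
        importlib.invalidate_caches()
        import food

        defined_names = dir(food)          # includes 'favoriteFood'
        sentence = food.favoriteFood()     # call through the module name
        module_doc = food.__doc__
    finally:
        sys.path.remove(tmp_dir)
        sys.modules.pop("food", None)

print(sentence)
print(module_doc)
```

Returning the string instead of printing it is a design choice for the sketch: it lets the caller decide what to do with the result and makes the function easier to test.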

If you don’t use the module name prefix, food in this case, you will get an error, as shown in the following example:

>>> favoriteFood()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'favoriteFood' is not defined

Using the alternative syntax for imports can eliminate this problem:

>>> from food import favoriteFood
>>> favoriteFood()

The only food worth eating is an omelet.

For more information you can send an email to kdatta23@gmail.com







