API
This section of the documentation covers all of the public interfaces of
python-multipart.
Helper Functions
-
multipart.multipart.parse_form(headers, input_stream, on_field, on_file, chunk_size=1048576, **kwargs)
This function is useful if you just want to parse a request body,
without too much work. Pass it a dictionary-like object of the request’s
headers, and a file-like object for the input stream, along with two
callbacks that will get called whenever a field or file is parsed.
Parameters: |
- headers – A dictionary-like object of HTTP headers. The only
required header is Content-Type.
- input_stream – A file-like object that represents the request body.
The read() method must return bytestrings.
- on_field – Callback to call with each parsed field.
- on_file – Callback to call with each parsed file.
- chunk_size – The maximum size to read from the input stream and write
to the parser at one time. Defaults to 1 MiB.
|
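The role of chunk_size can be pictured with a small stand-alone sketch. It does not import python-multipart and is not the library's implementation. RecordingParser and feed_in_chunks are hypothetical names, and the loop assumes only what is stated above: read() returns bytestrings, and the parser receives at most chunk_size bytes per write.

```python
import io


class RecordingParser:
    """Hypothetical stand-in for a parser: it only records what it is given."""

    def __init__(self):
        self.chunks = []
        self.finalized = False

    def write(self, data):
        self.chunks.append(data)

    def finalize(self):
        self.finalized = True


def feed_in_chunks(input_stream, parser, chunk_size=1048576):
    """Read at most chunk_size bytes at a time and pass each chunk on."""
    while True:
        chunk = input_stream.read(chunk_size)
        if not chunk:  # an empty bytestring means the stream is exhausted
            break
        parser.write(chunk)
    parser.finalize()


body = io.BytesIO(b"name=value&other=thing")
parser = RecordingParser()
feed_in_chunks(body, parser, chunk_size=8)
print(parser.chunks)
```

A smaller chunk_size lowers peak memory use at the cost of more write() calls.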
-
multipart.multipart.create_form_parser(headers, on_field, on_file, trust_x_headers=False, config={})
This is a helper function to aid in creating FormParser instances.
Given a dictionary-like headers object, it determines the information
needed, instantiates a FormParser with the appropriate values and the
given callbacks, and returns that parser.
Parameters: |
- headers – A dictionary-like object of HTTP headers. The only
required header is Content-Type.
- on_field – Callback to call with each parsed field.
- on_file – Callback to call with each parsed file.
- trust_x_headers – Whether or not to trust information received from
certain X-Headers - for example, the file name from
X-File-Name.
- config – Configuration variables to pass to the FormParser.
|
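The kind of information that has to be derived from the Content-Type header (the media type and, for multipart bodies, the boundary as a bytestring) can be sketched with the standard library alone. This is only an illustration, not how python-multipart parses the header. split_content_type is a hypothetical helper, and the latin-1 encoding of the boundary is an assumption.

```python
from email.message import Message


def split_content_type(header_value):
    """Return (content_type, boundary_or_None) for a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = header_value
    boundary = msg.get_param("boundary")  # None when the parameter is absent
    if boundary is not None:
        # Assumption: the boundary is wanted as a bytestring.
        boundary = boundary.encode("latin-1")
    return msg.get_content_type(), boundary


headers = {"Content-Type": "multipart/form-data; boundary=----abc123"}
content_type, boundary = split_content_type(headers["Content-Type"])
print(content_type, boundary)
```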
Main Class
-
class multipart.multipart.FormParser(content_type, on_field, on_file, on_end=None, boundary=None, file_name=None, FileClass=<class 'multipart.multipart.File'>, FieldClass=<class 'multipart.multipart.Field'>, config={})
This class is the all-in-one form parser. Given all the information
necessary to parse a form, it will instantiate the correct parser, create
the proper Field and File classes to store the data that
is parsed, and call the two given callbacks with each field and file as
they become available.
Parameters: |
- content_type – The Content-Type of the incoming request. This is
used to select the appropriate parser.
- on_field – The callback to call when a field has been parsed and is
ready for usage. See above for parameters.
- on_file – The callback to call when a file has been parsed and is
ready for usage. See above for parameters.
- on_end – An optional callback to call when all fields and files in a
request have been parsed. Can be None.
- boundary – If the request is a multipart/form-data request, this
should be the boundary of the request, as given in the
Content-Type header, as a bytestring.
- file_name – If the request is of type application/octet-stream, then
the body of the request will not contain any information
about the uploaded file. In such cases, you can provide
the file name of the uploaded file manually.
- FileClass –
The class to use for uploaded files. Defaults to
File, but you can provide your own class if you
wish to customize behaviour. The class will be
instantiated as FileClass(file_name, field_name), and it
must provide the following methods:
file_instance.write(data)
file_instance.finalize()
file_instance.close()
- FieldClass –
The class to use for uploaded fields. Defaults to
Field, but you can provide your own class if
you wish to customize behaviour. The class will be
instantiated as FieldClass(field_name), and it must
provide the following methods:
field_instance.write(data)
field_instance.finalize()
field_instance.close()
- config – Configuration to use for this FormParser. The default
values are taken from the DEFAULT_CONFIG value, and then
any keys present in this dictionary will overwrite the
default values.
|
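As a sketch of the FileClass contract described above, the following class hashes uploaded data instead of storing it. HashingFile is a hypothetical name. The **kwargs in the constructor is an assumption, added so that extra keyword arguments (such as a config) are tolerated. The example drives the object by hand rather than through a real FormParser.

```python
import hashlib


class HashingFile:
    """Hypothetical FileClass replacement that hashes uploads."""

    def __init__(self, file_name, field_name=None, **kwargs):
        # **kwargs is an assumption: it ignores any extra keyword arguments.
        self.file_name = file_name
        self.field_name = field_name
        self.size = 0
        self.digest = None
        self._hash = hashlib.sha256()

    def write(self, data):
        # Receives one bytestring per call.
        self._hash.update(data)
        self.size += len(data)
        return len(data)

    def finalize(self):
        # No more data will arrive; compute the final digest.
        self.digest = self._hash.hexdigest()

    def close(self):
        # Release the hashing state.
        self._hash = None


upload = HashingFile(b"report.pdf", b"attachment")
upload.write(b"foo")
upload.write(b"bar")
upload.finalize()
upload.close()
print(upload.size, upload.digest)
```

A custom FieldClass would follow the same pattern, with a constructor that takes only the field name.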
-
DEFAULT_CONFIG = {'UPLOAD_DIR': None, 'UPLOAD_KEEP_FILENAME': False, 'MAX_BODY_SIZE': inf, 'MAX_MEMORY_FILE_SIZE': 1048576, 'UPLOAD_ERROR_ON_BAD_CTE': False, 'UPLOAD_KEEP_EXTENSIONS': False}
This is the default configuration for our form parser.
Note: all file sizes should be in bytes.
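The way user-supplied keys overwrite the defaults can be sketched as a plain dictionary merge. The dictionary below is copied from the value shown above, and merge_config is a hypothetical helper rather than part of the library.

```python
# Copied from the DEFAULT_CONFIG value documented above.
DEFAULT_CONFIG = {
    "UPLOAD_DIR": None,
    "UPLOAD_KEEP_FILENAME": False,
    "MAX_BODY_SIZE": float("inf"),
    "MAX_MEMORY_FILE_SIZE": 1048576,
    "UPLOAD_ERROR_ON_BAD_CTE": False,
    "UPLOAD_KEEP_EXTENSIONS": False,
}


def merge_config(overrides):
    """Start from the defaults, then let the given keys overwrite them."""
    merged = dict(DEFAULT_CONFIG)  # copy, so the defaults stay untouched
    merged.update(overrides)
    return merged


config = merge_config({
    "MAX_MEMORY_FILE_SIZE": 64 * 1024,  # sizes are in bytes
    "UPLOAD_KEEP_EXTENSIONS": True,
})
print(config)
```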
-
close()
Close the parser.
-
finalize()
Finalize the parser.
-
write(data)
Write some data. The parser will forward this to the appropriate
underlying parser.
Parameters: | data – a bytestring |
Parsers
-
class multipart.multipart.BaseParser
This class is the base class for all parsers. It contains the logic for
calling and adding callbacks.
A callback can be one of two different forms. “Notification callbacks” are
callbacks that are called when something happens - for example, when a new
part of a multipart message is encountered by the parser. “Data callbacks”
are called when we get some sort of data - for example, part of the body of
a multipart chunk. Notification callbacks are called with no parameters,
whereas data callbacks are called with three, as follows:
data_callback(data, start, end)
The “data” parameter is a bytestring (i.e. "foo" on Python 2, or b"foo" on
Python 3). “start” and “end” are integer indexes into the “data” string
that represent the data of interest. Thus, in a data callback, the slice
data[start:end] represents the data that the callback is “interested in”.
The callback is not passed a copy of the data, since copying severely hurts
performance.
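The two callback forms can be illustrated without a parser. In this sketch the calls are simulated by hand in the shape described above. The buffer contents and the index values are made up for the example and are not real parser output.

```python
collected = []
events = []


def on_start():
    # Notification callback: called with no parameters.
    events.append("start")


def on_data(data, start, end):
    # Data callback: only data[start:end] is of interest. Slicing makes the
    # copy here, in the callback, and only for the bytes we want to keep.
    collected.append(data[start:end])


# Simulated calls; a real parser would make these while consuming input.
buffer = b"--boundary\r\nhello world\r\n"
on_start()
on_data(buffer, 12, 17)
on_data(buffer, 17, 23)

payload = b"".join(collected)
print(payload)
```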
-
callback(name, data=None, start=None, end=None)
This function calls a provided callback with some data. If the
callback is not set, it does nothing.
Parameters: |
- name – The name of the callback to call (as a string).
- data – Data to pass to the callback. If None, then it is
assumed that the callback is a notification callback,
and no parameters are given.
- end – An integer that is passed to the data callback.
- start – An integer that is passed to the data callback.
|
-
set_callback(name, new_func)
Update the function for a callback. Removes from the callbacks dict
if new_func is None.
Parameters: |
- name – The name of the callback to call (as a string).
- new_func – The new function for the callback. If None, then the
callback will be removed (with no error if it does not
exist).
|
-
class multipart.multipart.OctetStreamParser(callbacks={}, max_size=inf)
This parser parses an octet-stream request body and calls callbacks when
incoming data is received. Callbacks are as follows:
Callback Name | Parameters | Description |
on_start | None | Called when the first data is parsed. |
on_data | data, start, end | Called for each data chunk that is parsed. |
on_end | None | Called when the parser is finished parsing all data. |
Parameters: |
- callbacks – A dictionary of callbacks. See the documentation for
BaseParser.
- max_size – The maximum size of body to parse. Defaults to infinity -
i.e. unbounded.
|
-
finalize()
Finalize this parser, which signals that we are finished parsing and
sends the on_end callback.
-
write(data)
Write some data to the parser, which will perform size verification,
and then pass the data to the underlying callback.
Parameters: | data – a bytestring |
-
class multipart.multipart.QuerystringParser(callbacks={}, strict_parsing=False, max_size=inf)
This is a streaming querystring parser. It will consume data, and call
the callbacks given when it has data.
Callback Name | Parameters | Description |
on_field_start | None | Called when a new field is encountered. |
on_field_name | data, start, end | Called when a portion of a field’s name is encountered. |
on_field_data | data, start, end | Called when a portion of a field’s data is encountered. |
on_field_end | None | Called when the end of a field is encountered. |
on_end | None | Called when the parser is finished parsing all data. |
Parameters: |
- callbacks – A dictionary of callbacks. See the documentation for
BaseParser.
- strict_parsing – Whether or not to parse the body strictly. Defaults
to False. A field with an equals sign (e.g. “foo=bar”
or “foo=”) is always included. A field with no equals
sign (e.g. “...&name&...”) is included when
strict_parsing is False, and is treated as an error
when it is True. If an error is encountered, then a
multipart.exceptions.QuerystringParseError
will be raised.
- max_size – The maximum size of body to parse. Defaults to infinity -
i.e. unbounded.
|
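The effect of strict_parsing on fields without an equals sign can be sketched with a small non-streaming function. This is not the library's parser. split_fields is a hypothetical helper, ValueError stands in for QuerystringParseError, and skipping empty segments is an assumption of the sketch.

```python
def split_fields(body, strict_parsing=False):
    """Return (name, value) pairs; value is None for a field with no '='."""
    fields = []
    for raw in body.split(b"&"):
        if not raw:
            continue  # assumption: empty segments such as "a&&b" are skipped
        name, sep, value = raw.partition(b"=")
        if sep:
            # "foo=bar" and "foo=" are always included.
            fields.append((name, value))
        elif strict_parsing:
            # Stand-in for QuerystringParseError.
            raise ValueError("field %r has no equals sign" % name)
        else:
            fields.append((name, None))
    return fields


lenient = split_fields(b"foo&bar=&baz=asdf")
print(lenient)

try:
    split_fields(b"foo&bar=&baz=asdf", strict_parsing=True)
    strict_failed = False
except ValueError:
    strict_failed = True
print(strict_failed)
```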
-
finalize()
Finalize this parser, which signals that we are finished parsing. It
sends an on_field_end callback if we’re still in the middle of a
field, and then sends the on_end callback.
-
write(data)
Write some data to the parser, which will perform size verification,
parse into either a field name or value, and then pass the
corresponding data to the underlying callback. If an error is
encountered while parsing, a QuerystringParseError will be raised. The
“offset” attribute of the raised exception will be set to the offset in
the input data chunk (NOT the overall stream) that caused the error.
Parameters: | data – a bytestring |
-
class multipart.multipart.MultipartParser(boundary, callbacks={}, max_size=inf)
This class is a streaming multipart/form-data parser.
Callback Name | Parameters | Description |
on_part_begin | None | Called when a new part of the multipart message is encountered. |
on_part_data | data, start, end | Called when a portion of a part’s data is encountered. |
on_part_end | None | Called when the end of a part is reached. |
on_header_begin | None | Called when we’ve found a new header in a part of a multipart message. |
on_header_field | data, start, end | Called each time an additional portion of a header’s field name is read (i.e. the part of the header that is before the colon; the “Foo” in “Foo: Bar”). |
on_header_value | data, start, end | Called when we get data for a header. |
on_header_end | None | Called when the current header is finished - i.e. we’ve reached the newline at the end of the header. |
on_headers_finished | None | Called when all headers are finished, and before the part data starts. |
on_end | None | Called when the parser is finished parsing all data. |
Parameters: |
- boundary – The multipart boundary. This is required, and must match
what is given in the HTTP request - usually in the
Content-Type header.
- callbacks – A dictionary of callbacks. See the documentation for
BaseParser.
- max_size – The maximum size of body to parse. Defaults to infinity -
i.e. unbounded.
|
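Because header callbacks deliver portions of data, a handler usually has to accumulate pieces until on_header_end arrives. The sketch below shows one way to do that. HeaderCollector is a hypothetical class, the callbacks dictionary assumes keys equal to the callback names in the table above, and the calls are simulated by hand instead of coming from a real MultipartParser.

```python
class HeaderCollector:
    """Hypothetical helper that joins header portions into whole headers."""

    def __init__(self):
        self.headers = {}
        self._field = []
        self._value = []

    def on_header_field(self, data, start, end):
        self._field.append(data[start:end])

    def on_header_value(self, data, start, end):
        self._value.append(data[start:end])

    def on_header_end(self):
        # The header is complete: join the portions and reset the buffers.
        name = b"".join(self._field).lower()
        self.headers[name] = b"".join(self._value)
        self._field, self._value = [], []

    def callbacks(self):
        # Assumption: the dictionary is keyed by the documented names.
        return {
            "on_header_field": self.on_header_field,
            "on_header_value": self.on_header_value,
            "on_header_end": self.on_header_end,
        }


collector = HeaderCollector()
# Simulate one header name split across two input chunks.
chunk1 = b"Content-Dispo"
chunk2 = b"sition: form-data"
collector.on_header_field(chunk1, 0, 13)
collector.on_header_field(chunk2, 0, 6)
collector.on_header_value(chunk2, 8, 17)
collector.on_header_end()
print(collector.headers)
```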
-
finalize()
Finalize this parser, which signals that we are finished parsing.
Note: this does not currently verify that we are in the final state of
the parser (i.e. that the end of the multipart message is well-formed).
In the future it will, and it will throw an error if not.
-
write(data)
Write some data to the parser, which will perform size verification,
and then parse the data into the appropriate location (e.g. header,
data, etc.), and pass this on to the underlying callback. If an error
is encountered, a MultipartParseError will be raised. The “offset”
attribute on the raised exception will be set to the offset of the byte
in the input chunk that caused the error.
Parameters: | data – a bytestring |
Support Classes
-
class multipart.multipart.Field(name)
A Field object represents a (parsed) form field. It represents a single
field with a corresponding name and value.
The name that a Field will be instantiated with is the same name
that would be found in the following HTML:
<input name="name_goes_here" type="text"/>
This class defines two methods, on_data() and on_end(), that
will be called when data is written to the Field, and when the Field is
finalized, respectively.
Parameters: | name – the name of the form field |
-
close()
Close the Field object. This will free any underlying cache.
-
field_name
This property returns the name of the field.
-
finalize()
Finalize the form field.
-
classmethod from_value(klass, name, value)
Create an instance of a Field, and set the corresponding
value - either None or an actual value. This method will also
finalize the Field itself.
Parameters: |
- name – the name of the form field
- value – the value of the form field - either a bytestring or
None
|
-
on_data(data)
This method is a callback that will be called whenever data is
written to the Field.
Parameters: | data – a bytestring |
-
on_end()
This method is called whenever the Field is finalized.
-
set_none()
Some fields in a querystring can possibly have a value of None - for
example, the string “foo&bar=&baz=asdf” will have a field with the
name “foo” and value None, one with name “bar” and value “”, and one
with name “baz” and value “asdf”. Since the write() interface doesn’t
support writing None, this function will set the field value to None.
-
value
This property returns the value of the form field.
-
write(data)
Write some data into the form field.
Parameters: | data – a bytestring |
-
class multipart.multipart.File(file_name, field_name=None, config={})
This class represents an uploaded file. It writes file data to an
in-memory file, and switches to a temporary file on disk once the
optional size threshold is exceeded.
There are some options that can be passed to the File to change behavior
of the class. Valid options are as follows:
Name | Type | Default | Description |
UPLOAD_DIR | str | None | The directory to store uploaded files in. If this is None, a temporary file will be created in the system’s standard location. |
UPLOAD_KEEP_FILENAME | bool | False | Whether or not to keep the filename of the uploaded file. If True, then the filename will be converted to a safe representation (e.g. by removing any invalid path segments), and then saved with the same name. Otherwise, a temporary name will be used. |
UPLOAD_KEEP_EXTENSIONS | bool | False | Whether or not to keep the uploaded file’s extension. If False, the file will be saved with the default temporary extension (usually ".tmp"). Otherwise, the file’s extension will be maintained. Note that this will properly combine with the UPLOAD_KEEP_FILENAME setting. |
MAX_MEMORY_FILE_SIZE | int | 1 MiB | The maximum number of bytes of a File to keep in memory. By default, the contents of a File are kept in memory until a certain limit is reached, after which the contents of the File are written to a temporary file. This behavior can be disabled by setting this value to an appropriately large value (or, for example, infinity, such as float('inf')). |
Parameters: |
- file_name – The name of the file that this File represents
- field_name – The field name that uploaded this file. Note that this
can be None, if, for example, the file was uploaded
with Content-Type application/octet-stream
- config – The configuration for this File. See above for valid
configuration keys and their corresponding values.
|
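The memory-then-disk behavior governed by MAX_MEMORY_FILE_SIZE can be sketched with the standard library. This is not the File class itself. RolloverBuffer is a hypothetical name, and whether the switch happens at or just above the limit is an assumption made here for illustration.

```python
import io
import tempfile


class RolloverBuffer:
    """Hypothetical sketch: keep data in memory until a size limit is passed."""

    def __init__(self, max_memory_size=1048576):
        self.max_memory_size = max_memory_size
        self.file_object = io.BytesIO()
        self.in_memory = True
        self.size = 0

    def write(self, data):
        self.file_object.write(data)
        self.size += len(data)
        # Assumption: roll over once the total exceeds the limit.
        if self.in_memory and self.size > self.max_memory_size:
            self._flush_to_disk()
        return len(data)

    def _flush_to_disk(self):
        # Copy the in-memory contents to a temporary file and switch to it.
        disk_file = tempfile.TemporaryFile()
        disk_file.write(self.file_object.getvalue())
        self.file_object.close()
        self.file_object = disk_file
        self.in_memory = False

    def close(self):
        self.file_object.close()


buf = RolloverBuffer(max_memory_size=4)
buf.write(b"abc")
before = buf.in_memory  # still within the limit
buf.write(b"def")
after = buf.in_memory   # limit exceeded, now backed by a disk file
buf.file_object.seek(0)
contents = buf.file_object.read()
buf.close()
print(before, after, contents)
```

The standard library's tempfile.SpooledTemporaryFile offers comparable rollover behavior for code that does not need the rest of the File interface.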
-
actual_file_name
The file name that this file is saved as. Will be None if it’s not
currently saved on disk.
-
close()
Close the File object. This will actually close the underlying
file object (whether it’s a io.BytesIO or an actual file
object).
-
field_name
The form field associated with this file. May be None if there isn’t
one, for example when we have an application/octet-stream upload.
-
file_name
The file name given in the upload request.
-
file_object
The file object that we’re currently writing to. Note that this
will either be an instance of a io.BytesIO, or a regular file
object.
-
finalize()
Finalize the form file. This will not close the underlying file,
but simply signal that we are finished writing to the File.
-
flush_to_disk()
If the file is already on-disk, do nothing. Otherwise, copy from
the in-memory buffer to a disk file, and then reassign our internal
file object to this new disk file.
Note that if you attempt to flush a file that is already on-disk, a
warning will be logged to this module’s logger.
-
in_memory
A boolean representing whether or not this file object is currently
stored in-memory or on-disk.
-
on_data(data)
This method is a callback that will be called whenever data is
written to the File.
Parameters: | data – a bytestring |
-
on_end()
This method is called whenever the File is finalized.
-
size
The total size of this file, counted as the number of bytes that
currently have been written to the file.
-
write(data)
Write some data to the File.
Parameters: | data – a bytestring |
Decoders
-
class multipart.decoders.Base64Decoder(underlying)
This object provides an interface to decode a stream of Base64 data. It
is instantiated with an “underlying object”, and whenever a write()
operation is performed, it will decode the incoming data as Base64, and
call write() on the underlying object. This is primarily used for decoding
form data encoded as Base64, but can be used for other purposes:
from multipart.decoders import Base64Decoder

# Open the destination in binary mode; the decoder writes decoded bytes to it.
fd = open("notb64.txt", "wb")
decoder = Base64Decoder(fd)
try:
    decoder.write(b"Zm9vYmFy")  # "foobar" in Base64, passed as a bytestring
    decoder.finalize()
finally:
    decoder.close()

# The contents of "notb64.txt" should be "foobar".
This object will also pass all finalize() and close() calls to the
underlying object, if the underlying object supports them.
Note that this class maintains a cache of base64 chunks, so that a write of
arbitrary size can be performed. You must call finalize() on this
object after all writes are completed to ensure that all data is flushed
to the underlying object.
Parameters: | underlying – the underlying object to pass writes to |
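The reason for the cache is that Base64 decodes in groups of four characters, while writes can end in the middle of a group. The stand-alone sketch below shows the idea using only the standard library. It is not the decoder's actual code. ChunkedBase64 is a hypothetical name, and ValueError stands in for DecodeError.

```python
import base64
import io


class ChunkedBase64:
    """Hypothetical sketch of a Base64 decoder that accepts arbitrary chunks."""

    def __init__(self, underlying):
        self.underlying = underlying
        self.cache = b""

    def write(self, data):
        data = self.cache + data
        # Decode only whole 4-character groups; keep the rest for later.
        usable = len(data) - (len(data) % 4)
        if usable:
            self.underlying.write(base64.b64decode(data[:usable]))
        self.cache = data[usable:]

    def finalize(self):
        if self.cache:
            # Stand-in for DecodeError: leftover data is an incomplete group.
            raise ValueError("%d bytes left in cache" % len(self.cache))


sink = io.BytesIO()
decoder = ChunkedBase64(sink)
for chunk in (b"Zm9", b"vYm", b"Fy"):  # "Zm9vYmFy" split at awkward points
    decoder.write(chunk)
decoder.finalize()
print(sink.getvalue())
```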
-
close()
Close this decoder. If the underlying object has a close()
method, this function will call it.
-
finalize()
Finalize this object. This should be called when no more data
should be written to the stream. This function can raise a
multipart.exceptions.DecodeError if there is some remaining
data in the cache.
If the underlying object has a finalize() method, this function will
call it.
-
write(data)
Takes any input data provided, decodes it as base64, and passes it
on to the underlying object. If the data provided is invalid base64
data, then this method will raise a
multipart.exceptions.DecodeError.
Parameters: | data – base64 data to decode |
-
class multipart.decoders.QuotedPrintableDecoder(underlying)
This object provides an interface to decode a stream of quoted-printable
data. It is instantiated with an “underlying object”, in the same manner
as the multipart.decoders.Base64Decoder class. This class behaves
in exactly the same way, including maintaining a cache of quoted-printable
chunks.
Parameters: | underlying – the underlying object to pass writes to |
-
close()
Close this decoder. If the underlying object has a close()
method, this function will call it.
-
finalize()
Finalize this object. This should be called when no more data
should be written to the stream. This function will not raise any
exceptions, but it may write more data to the underlying object if
there is data remaining in the cache.
If the underlying object has a finalize() method, this function will
call it.
-
write(data)
Takes any input data provided, decodes it as quoted-printable, and
passes it on to the underlying object.
Parameters: | data – quoted-printable data to decode |
Exceptions
The following are all custom exceptions that python-multipart will raise, for various cases. Each method that can raise one of these exceptions says so in its description.
-
exception multipart.exceptions.DecodeError
This exception is raised when there is a decoding error - for example
with the Base64Decoder or QuotedPrintableDecoder.
-
exception multipart.exceptions.FileError
Exception class for problems with the File class.
-
exception multipart.exceptions.FormParserError
Base error class for our form parser.
-
exception multipart.exceptions.MultipartParseError
This is a specific error that is raised when the MultipartParser detects
an error while parsing.
-
exception multipart.exceptions.ParseError
This exception (or a subclass) is raised when there is an error while
parsing something.
-
offset = -1
This is the offset in the input data chunk (NOT the overall stream) at
which the parse error occurred. It will be -1 if not specified.
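Since the offset is relative to the chunk being written, a caller that wants a position in the overall stream has to add the number of bytes written before that chunk. The sketch below shows that bookkeeping. ChunkParseError, fake_write and absolute_offset are hypothetical stand-ins, and no real parser is involved.

```python
class ChunkParseError(Exception):
    """Hypothetical stand-in for an exception with a chunk-relative offset."""

    offset = -1


def absolute_offset(consumed_before_chunk, error):
    """Translate a chunk-relative offset into a stream position, if known."""
    if error.offset < 0:
        return None  # the offset was not specified
    return consumed_before_chunk + error.offset


def fake_write(chunk):
    # Made-up rule for the example: a NUL byte counts as a parse error.
    bad = chunk.find(b"\x00")
    if bad != -1:
        err = ChunkParseError("unexpected NUL byte")
        err.offset = bad
        raise err


consumed = 0
failure_at = None
for chunk in (b"abcd", b"ef\x00g"):
    try:
        fake_write(chunk)
    except ChunkParseError as exc:
        failure_at = absolute_offset(consumed, exc)
        break
    consumed += len(chunk)
print(failure_at)
```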
-
exception multipart.exceptions.QuerystringParseError
This is a specific error that is raised when the QuerystringParser
detects an error while parsing.