parser
This module offers a generic date/time string parser which is able to parse most known formats to represent a date and/or time.
This module attempts to be forgiving with regards to unlikely input formats, returning a datetime object even for dates which are ambiguous. If an element of a date/time stamp is omitted, the following rules are applied:
If AM or PM is left unspecified, a 24-hour clock is assumed; however, an hour on a 12-hour clock (0 <= hour <= 12) must be specified if AM or PM is specified.
If a time zone is omitted, a timezone-naive datetime is returned.
If any other elements are missing, they are taken from the datetime.datetime object passed to the parameter default. If this results in a day number exceeding the valid number of days per month, the value falls back to the end of the month.
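The rules above can be illustrated with a short sketch. The DEFAULT value and the sample strings are chosen here purely for illustration; a fixed default is passed so that the results do not depend on the current date.

```python
from datetime import datetime
from dateutil.parser import parse

# Illustrative default; when no default is given, missing fields are
# filled in from the current date instead.
DEFAULT = datetime(2021, 1, 31)

# No AM/PM marker: the hour is read on a 24-hour clock.
evening_24h = parse("2021-03-04 17:05", default=DEFAULT)

# With a PM marker, the hour is read on a 12-hour clock.
evening_12h = parse("March 4, 2021 5:05 PM", default=DEFAULT)

# Only a time is given: year, month and day come from DEFAULT.
time_only = parse("10:36", default=DEFAULT)

# The day is missing and DEFAULT's day (31) does not exist in February,
# so the result falls back to the last day of that month.
month_only = parse("February 2021", default=DEFAULT)

print(evening_24h, evening_12h, time_only, month_only, sep="\n")
```

None of the strings carries a time zone, so every result is a timezone-naive datetime.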
Additional resources about date/time string formats can be found below:
Functions
- parser.parse(timestr, parserinfo=None, **kwargs)
Parse a string in one of the supported formats, using the parserinfo parameters.
- Parameters:
timestr – A string containing a date/time stamp.
parserinfo – A parserinfo object containing parameters for the parser. If None, the default arguments to the parserinfo constructor are used.
The **kwargs parameter takes the following keyword arguments:
- Parameters:
default – The default datetime object. If this is a datetime object and not None, elements specified in timestr replace elements in the default object.
ignoretz – If set True, time zones in parsed strings are ignored and a naive datetime object is returned.
tzinfos – Additional time zone names / aliases which may be present in the string. This argument maps time zone names (and optionally offsets from those time zones) to time zones. This parameter can be a dictionary with timezone aliases mapping time zone names to time zones or a function taking two parameters (tzname and tzoffset) and returning a time zone.
The timezones to which the names are mapped can be an integer offset from UTC in seconds or a tzinfo object.
    >>> from dateutil.parser import parse
    >>> from dateutil.tz import gettz
    >>> tzinfos = {"BRST": -7200, "CST": gettz("America/Chicago")}
    >>> parse("2012-01-19 17:21:00 BRST", tzinfos=tzinfos)
    datetime.datetime(2012, 1, 19, 17, 21, tzinfo=tzoffset(u'BRST', -7200))
    >>> parse("2012-01-19 17:21:00 CST", tzinfos=tzinfos)
    datetime.datetime(2012, 1, 19, 17, 21, tzinfo=tzfile('/usr/share/zoneinfo/America/Chicago'))
This parameter is ignored if ignoretz is set.
dayfirst – Whether to interpret the first value in an ambiguous 3-integer date (e.g. 01/05/09) as the day (True) or month (False). If yearfirst is set to True, this distinguishes between YDM and YMD. If set to None, this value is retrieved from the current parserinfo object (which itself defaults to False).
yearfirst – Whether to interpret the first value in an ambiguous 3-integer date (e.g. 01/05/09) as the year. If True, the first number is taken to be the year, otherwise the last number is taken to be the year. If this is set to None, the value is retrieved from the current parserinfo object (which itself defaults to False).
fuzzy – Whether to allow fuzzy parsing, allowing for strings like “Today is January 1, 2047 at 8:21:00AM”.
fuzzy_with_tokens – If True, fuzzy is automatically set to True, and the parser will return a tuple where the first element is the parsed datetime.datetime object and the second element is a tuple containing the portions of the string which were ignored:
    >>> from dateutil.parser import parse
    >>> parse("Today is January 1, 2047 at 8:21:00AM", fuzzy_with_tokens=True)
    (datetime.datetime(2047, 1, 1, 8, 21), (u'Today is ', u' ', u'at '))
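As a rough illustration of how dayfirst, yearfirst, tzinfos and ignoretz interact, the sketch below parses the same ambiguous string under each flag combination and resolves a time zone alias through a callable. The name resolve_tz and the constant AMBIGUOUS are hypothetical and exist only for this example.

```python
from datetime import timedelta
from dateutil import tz
from dateutil.parser import parse

AMBIGUOUS = "01/05/09"

# The two-digit year is resolved by the parser relative to the current
# year, so only the month/day ordering is noted in the comments here.
default_order = parse(AMBIGUOUS)                                   # MM/DD/YY
day_first = parse(AMBIGUOUS, dayfirst=True)                        # DD/MM/YY
year_first = parse(AMBIGUOUS, yearfirst=True)                      # YY/MM/DD
year_day_first = parse(AMBIGUOUS, yearfirst=True, dayfirst=True)   # YY/DD/MM


def resolve_tz(tzname, tzoffset):
    """Hypothetical tzinfos callable that only knows the 'BRST' alias."""
    if tzname == "BRST":
        # Fixed offset of -7200 seconds, as in the dictionary example.
        return tz.tzoffset(tzname, -7200)
    return None


stamp = "2012-01-19 17:21:00 BRST"
aware = parse(stamp, tzinfos=resolve_tz)   # alias resolved by the callable
naive = parse(stamp, ignoretz=True)        # time zone dropped entirely

print(aware.utcoffset() == timedelta(hours=-2), naive.tzinfo)
```

A callable is handy when aliases have to be computed rather than listed up front; a plain dictionary is usually enough for a fixed set of names.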
- Returns:
Returns a datetime.datetime object or, if the fuzzy_with_tokens option is True, returns a tuple, the first element being a datetime.datetime object, the second a tuple containing the fuzzy tokens.
- Raises:
ParserError – Raised for invalid or unknown string formats, if the provided tzinfo is not in a valid format, or if an invalid date would be created.
OverflowError – Raised if the parsed date exceeds the largest valid C integer on your system.
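One possible way to handle the documented exceptions and the tuple return value is sketched below. The helper try_parse is hypothetical and not part of dateutil, and importing ParserError from dateutil.parser assumes a release that provides it.

```python
from dateutil.parser import ParserError, parse


def try_parse(text):
    """Hypothetical helper: return a datetime, or None if parsing fails."""
    try:
        return parse(text)
    except (ParserError, OverflowError):
        return None


ok = try_parse("2003-09-25T10:49:41")
unknown = try_parse("not a date")   # unknown string format
invalid = try_parse("2021-02-30")   # February 30 would be an invalid date

# With fuzzy_with_tokens=True the return value is a (datetime, tokens) tuple.
when, skipped = parse("Today is January 1, 2047 at 8:21:00AM",
                      fuzzy_with_tokens=True)

print(ok, unknown, invalid, when, skipped)
```

Returning None hides the reason for the failure; callers that need the message can catch ParserError themselves and inspect it.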
- parser.isoparse(dt_str)
Parse an ISO-8601 datetime string into a datetime.datetime.
An ISO-8601 datetime string consists of a date portion, followed optionally by a time portion. The date and time portions are separated by a single character separator, which is T in the official standard. Incomplete date formats (such as YYYY-MM) may not be combined with a time portion.
Supported date formats are:
Common:
YYYY
YYYY-MM
YYYY-MM-DD or YYYYMMDD
Uncommon:
YYYY-Www or YYYYWww - ISO week (day defaults to 0)
YYYY-Www-D or YYYYWwwD - ISO week and day
The ISO week and day numbering follows the same logic as datetime.date.isocalendar().
Supported time formats are:
hh
hh:mm or hhmm
hh:mm:ss or hhmmss
hh:mm:ss.ssssss (Up to 6 sub-second digits)
Midnight is a special case for hh, as the standard supports both 00:00 and 24:00 as a representation. The decimal separator can be either a dot or a comma.
Caution
Support for fractional components other than seconds is part of the ISO-8601 standard, but is not currently implemented in this parser.
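The date and time formats listed above can be exercised with a few sample strings. This is only a sketch; the inputs are illustrative and the variable names are not part of the library.

```python
from datetime import datetime
from dateutil.parser import isoparse

full = isoparse("2018-04-29T17:45:25")
compact = isoparse("20180429T174525")       # forms without separators
date_only = isoparse("2018-04-29")          # no time portion given
week_date = isoparse("2004-W01-1")          # ISO week 1 of 2004, day 1
end_of_day = isoparse("2014-04-11T24:00")   # 24:00 form of midnight
comma_frac = isoparse("2018-04-29T17:45:25,5")  # comma as decimal separator

print(full, compact, date_only, week_date, end_of_day, comma_frac, sep="\n")
```

The week date lands in late December 2003 because ISO week 1 of 2004 starts on the Monday of the week containing January 1, which matches datetime.date.isocalendar().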
Supported time zone offset formats are:
Z (UTC)
±HH:MM
±HHMM
±HH
Offsets will be represented as dateutil.tz.tzoffset objects, with the exception of UTC, which will be represented as dateutil.tz.tzutc. Time zone offsets equivalent to UTC (such as +00:00) will also be represented as dateutil.tz.tzutc.
- Parameters:
dt_str – A string or stream containing only an ISO-8601 datetime string
- Returns:
Returns a datetime.datetime representing the string. Unspecified components default to their lowest value.
Warning
As of version 2.7.0, the strictness of the parser should not be considered a stable part of the contract. Any valid ISO-8601 string that parses correctly with the default settings will continue to parse correctly in future versions, but invalid strings that currently fail (e.g. 2017-01-01T00:00+00:00:00) are not guaranteed to continue failing in future versions if they encode a valid date.
Added in version 2.7.0.
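To make the time zone handling of isoparse concrete, the following sketch checks which tzinfo classes come back for a few offsets and how a year-only string is completed. The sample strings and variable names are illustrative.

```python
from datetime import datetime, timedelta
from dateutil import tz
from dateutil.parser import isoparse

zulu = isoparse("2018-04-29T17:45:25Z")
zero = isoparse("2018-04-29T17:45:25+00:00")     # equivalent to UTC
india = isoparse("2018-04-29T17:45:25+05:30")    # ±HH:MM form
hours_only = isoparse("2018-04-29T17:45:25-03")  # ±HH form
year_only = isoparse("2018")                     # remaining fields default

# UTC and UTC-equivalent offsets are expected to use tz.tzutc;
# other offsets are expected to use tz.tzoffset.
print(type(zulu.tzinfo).__name__, type(zero.tzinfo).__name__)
print(type(india.tzinfo).__name__, india.utcoffset())
print(hours_only.utcoffset(), year_only)
```

Comparing utcoffset() values rather than tzinfo objects keeps checks independent of which concrete class represents the offset.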
Classes
- class dateutil.parser.parserinfo(dayfirst=False, yearfirst=False)
Class which handles what inputs are accepted. Subclass this to customize the language and acceptable values for each parameter.
- Parameters:
dayfirst – Whether to interpret the first value in an ambiguous 3-integer date (e.g. 01/05/09) as the day (True) or month (False). If yearfirst is set to True, this distinguishes between YDM and YMD. Default is False.
yearfirst – Whether to interpret the first value in an ambiguous 3-integer date (e.g. 01/05/09) as the year. If True, the first number is taken to be the year, otherwise the last number is taken to be the year. Default is False.
- AMPM = [('am', 'a'), ('pm', 'p')]
- HMS = [('h', 'hour', 'hours'), ('m', 'minute', 'minutes'), ('s', 'second', 'seconds')]
- JUMP = [' ', '.', ',', ';', '-', '/', "'", 'at', 'on', 'and', 'ad', 'm', 't', 'of', 'st', 'nd', 'rd', 'th']
- MONTHS = [('Jan', 'January'), ('Feb', 'February'), ('Mar', 'March'), ('Apr', 'April'), ('May', 'May'), ('Jun', 'June'), ('Jul', 'July'), ('Aug', 'August'), ('Sep', 'Sept', 'September'), ('Oct', 'October'), ('Nov', 'November'), ('Dec', 'December')]
- PERTAIN = ['of']
- TZOFFSET = {}
- UTCZONE = ['UTC', 'GMT', 'Z', 'z']
- WEEKDAYS = [('Mon', 'Monday'), ('Tue', 'Tuesday'), ('Wed', 'Wednesday'), ('Thu', 'Thursday'), ('Fri', 'Friday'), ('Sat', 'Saturday'), ('Sun', 'Sunday')]
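Subclassing parserinfo to customize the language might look roughly like the sketch below. SpanishParserInfo is a hypothetical class written for this example; it overrides MONTHS and extends JUMP so that the filler word "de" is skipped, and it leaves every other attribute at its default.

```python
from datetime import datetime
from dateutil.parser import parse, parserinfo


class SpanishParserInfo(parserinfo):
    """Hypothetical subclass that recognises Spanish month names."""

    # Same shape as the default MONTHS: one tuple of aliases per month.
    MONTHS = [
        ("ene", "enero"), ("feb", "febrero"), ("mar", "marzo"),
        ("abr", "abril"), ("may", "mayo"), ("jun", "junio"),
        ("jul", "julio"), ("ago", "agosto"),
        ("sep", "sept", "septiembre"), ("oct", "octubre"),
        ("nov", "noviembre"), ("dic", "diciembre"),
    ]
    # Skip "de" in addition to the default filler tokens.
    JUMP = parserinfo.JUMP + ["de"]


spanish = parse("4 de marzo de 2021", parserinfo=SpanishParserInfo())

# The constructor arguments change how ambiguous numeric dates are read.
day_first = parse("01/05/2009", parserinfo=parserinfo(dayfirst=True))

print(spanish, day_first)
```

Because MONTHS is replaced rather than extended here, English month names are no longer recognised by this subclass; appending the Spanish aliases to each default tuple would keep both.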