Merge commit 'v0.8.0rc2-26-g76c6351' into debian-0.8
* commit 'v0.8.0rc2-26-g76c6351': (42 commits)
  BUG/TST: typo caused read_csv to lose index name pandas-dev#1536
  BUG: incorrect tick label positions pandas-dev#1531 (zooming is still wrong)
  ENH: register converters with matplotlib for better datetime conversion
  ENH: handle datetime.date in Period constructor
  DOC: small doc for pandas-dev#1450
  BUG: repr of pre-1900 datetime64 values in a DataFrame column close pandas-dev#1518
  BUG: workaround vstack/concat bug in numpy 1.6 pandas-dev#1518
  DOC: lreshape docstring, release note
  ENH: experimental lreshape function
  BUG: plotting DataFrame with freq with offset
  BUG: DataFrame plotting with inferred freq
  BUG: timedelta.total_seconds only in 2.7 and 3.2
  DOC: release notes
  ENH: Add raise on conflict keyword to update
  DOC: release notes re: pandas-dev#921
  overload header keyword instead of extra col_aliases keyword
  ENH: column aliases for to_csv/to_excel pandas-dev#921
  ENH: handle weekly resampling via daily
  BUG: plot mixed frequencies pandas-dev#1517
  BUG/TST: plot irregular and reg freq on same subplot
  ...
yarikoptic committed Jun 27, 2012
2 parents 1c08383 + 76c6351 commit 1f2b250
Showing 43 changed files with 1,091 additions and 199 deletions.
25 changes: 22 additions & 3 deletions RELEASE.rst
@@ -25,7 +25,7 @@ Where to get it
pandas 0.8.0
============

**Release date:** NOT YET RELEASED
**Release date:** 6/26/2012

**New features**

@@ -43,7 +43,7 @@ pandas 0.8.0
conversion method (#1018)
- Implement robust frequency inference function and `inferred_freq` attribute
on DatetimeIndex (#391)
- New ``tz_convert`` methods in Series / DataFrame
- New ``tz_convert`` and ``tz_localize`` methods in Series / DataFrame
- Convert DatetimeIndexes to UTC if time zones are different in join/setops
(#864)
- Add limit argument for forward/backward filling to reindex, fillna,
@@ -86,7 +86,10 @@ pandas 0.8.0
- Add lag plot (#1440)
- Add autocorrelation_plot (#1425)
- Add support for tox and Travis CI (#1382)
- Add support for ordered factors and use in GroupBy (#292)
- Add support for Categorical use in GroupBy (#292)
- Add ``any`` and ``all`` methods to DataFrame (#1416)
- Add ``secondary_y`` option to Series.plot
- Add experimental ``lreshape`` function for reshaping wide to long

**Improvements to existing features**

@@ -124,9 +127,20 @@ pandas 0.8.0
- Add ``convert_dtype`` option to Series.apply to be able to leave data as
dtype=object (#1414)
- Can specify all index level names in concat (#1419)
- Add ``dialect`` keyword to parsers for quoting conventions (#1363)
- Enable DataFrame[bool_DataFrame] += value (#1366)
- Add ``retries`` argument to ``get_data_yahoo`` to try to prevent Yahoo! API
404s (#826)
- Improve performance of reshaping by using O(N) categorical sorting
- Series names will be used for index of DataFrame if no index passed (#1494)
- Header argument in DataFrame.to_csv can accept a list of column names to
use instead of the object's columns (#921)
- Add ``raise_conflict`` argument to DataFrame.update (#1526)

**API Changes**

- Rename Factor to Categorical and add improvements. Numerous Categorical bug
fixes
- Frequency name overhaul, WEEKDAY/EOM and rules with @
deprecated. get_legacy_offset_name backwards compatibility function added
- Raise ValueError in DataFrame.__nonzero__, so "if df" no longer works
@@ -190,6 +204,11 @@ pandas 0.8.0
- Fix outer/inner DataFrame.join with non-unique indexes (#1421)
- Fix MultiIndex groupby bugs with empty lower levels (#1401)
- Calling fillna with a Series will have same behavior as with dict (#1486)
- SparseSeries reduction bug (#1375)
- Fix unicode serialization issue in HDFStore (#1361)
- Pass keywords to pyplot.boxplot in DataFrame.boxplot (#1493)
- Bug fixes in MonthBegin (#1483)
- Preserve MultiIndex names in drop (#1513)

pandas 0.7.3
============
24 changes: 24 additions & 0 deletions doc/source/gotchas.rst
@@ -217,3 +217,27 @@ passed in the index, thus finding the integers ``0`` and ``1``. While it would
be possible to insert some logic to check whether a passed sequence is all
contained in the index, that logic would exact a very high cost in large data
sets.

Timestamp limitations
---------------------

Minimum and maximum timestamps
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since pandas represents timestamps in nanosecond resolution, the timespan that
can be represented using a 64-bit integer is limited to approximately 584 years:

.. ipython:: python

   begin = Timestamp(-9223285636854775809L)
   begin
   end = Timestamp(np.iinfo(np.int64).max)
   end

If you need to represent time series data outside the nanosecond timespan, use
PeriodIndex:

.. ipython:: python

   span = period_range('1215-01-01', '1381-01-01', freq='D')
   span
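
The "approximately 584 years" figure in the added gotchas text can be checked with plain arithmetic. The sketch below is not part of the commit. It assumes the nanosecond counter is a signed 64-bit integer centred on the Unix epoch (1970-01-01) and uses a 365.25-day year; both are assumptions, and ``datetime`` only resolves microseconds, so the bounds are truncated:

```python
from datetime import datetime, timedelta

NS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: 365.25-day year

# A signed 64-bit integer covers 2**64 distinct nanosecond values.
span_years = 2**64 / NS_PER_SECOND / SECONDS_PER_YEAR

# Approximate bounds, assuming the counter is centred on the Unix epoch.
# datetime has microsecond resolution, so nanosecond counts are truncated.
epoch = datetime(1970, 1, 1)
latest = epoch + timedelta(microseconds=(2**63 - 1) // 1000)
earliest = epoch - timedelta(microseconds=2**63 // 1000)

print(round(span_years, 1))
print(earliest.year, latest.year)
```

Under those assumptions the representable range is roughly 292 years on either side of 1970, which is why dates such as 1215 in the ``period_range`` call need a period-based index instead.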
6 changes: 4 additions & 2 deletions doc/source/io.rst
@@ -83,8 +83,10 @@ data into a DataFrame object. They can take a number of arguments:
as the index.
- ``names``: List of column names to use. If passed, header will be
implicitly set to None.
- ``na_values``: optional list of strings to recognize as NaN (missing values),
in addition to a default set.
- ``na_values``: optional list of strings to recognize as NaN (missing
values), in addition to a default set. If you pass an empty list, or an
empty list for a particular column, no values (including empty strings)
will be considered NA.
- ``parse_dates``: if True then index will be parsed as dates
(False by default). You can specify more complicated options to parse
a subset of columns or a combination of columns into a single date column
2 changes: 1 addition & 1 deletion doc/source/timeseries.rst
@@ -297,7 +297,7 @@ We could have done the same thing with ``DateOffset``:

.. ipython:: python

   from pandas.core.datetools import *
   from pandas.tseries.offsets import *
   d + DateOffset(months=4, days=5)

The key features of a ``DateOffset`` object are:
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
66 changes: 53 additions & 13 deletions doc/source/whatsnew/v0.8.0.txt → doc/source/v0.8.0.txt
@@ -67,15 +67,16 @@ Time series changes and improvements
PeriodIndex and DatetimeIndex
- New Timestamp data type subclasses `datetime.datetime`, providing the same
interface while enabling working with nanosecond-resolution data. Also
provides **easy time zone conversions**
- Enhanced support for **time zones**. Add `tz_convert` methods to TimeSeries
and DataFrame. All timestamps are stored as UTC; Timestamps from
DatetimeIndex objects with time zone set will be localized to localtime. Time
zone conversions are therefore essentially free. User needs to know very
little about pytz library now; only time zone names as strings are
required. Timestamps are equal if and only if their UTC timestamps
match. Operations between time series with different time zones will result
in a UTC-indexed time series
provides :ref:`easy time zone conversions <timeseries.timezone>`.
- Enhanced support for :ref:`time zones <timeseries.timezone>`. Add
``tz_convert`` and ``tz_localize`` methods to TimeSeries and DataFrame. All
timestamps are stored as UTC; Timestamps from DatetimeIndex objects with time
zone set will be localized to localtime. Time zone conversions are therefore
essentially free. User needs to know very little about pytz library now; only
time zone names as strings are required. Time zone-aware timestamps are
equal if and only if their UTC timestamps match. Operations between time
zone-aware time series with different time zones will result in a UTC-indexed
time series.
- Time series **string indexing conveniences** / shortcuts: slice years, year
and month, and index values with strings
- Enhanced time series **plotting**; adaptation of scikits.timeseries
@@ -111,8 +112,11 @@ index duplication in many-to-many joins)
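
The time zone bullet above states that zone-aware timestamps compare equal exactly when their UTC instants match. The same rule holds for the standard library's aware ``datetime`` objects, which makes for a small illustration. This sketch uses only the standard library, not the pandas API, and the fixed ``-04:00`` offset labelled ``EDT`` is an assumption chosen for the example:

```python
from datetime import datetime, timedelta, timezone

utc = timezone.utc
# Assumption: a fixed UTC-4 offset stands in for a real time zone database entry.
eastern_daylight = timezone(timedelta(hours=-4), "EDT")

noon_utc = datetime(2012, 6, 26, 12, 0, tzinfo=utc)
eight_am_edt = datetime(2012, 6, 26, 8, 0, tzinfo=eastern_daylight)

# Different wall-clock times, same instant: equality is decided in UTC.
same_instant = noon_utc == eight_am_edt

# Converting changes only the presentation, not the underlying instant.
converted = noon_utc.astimezone(eastern_daylight)
print(same_instant, converted.isoformat())
```

Storing the instant once and treating the zone as presentation is what makes conversions cheap, as the release note describes.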
Other new features
~~~~~~~~~~~~~~~~~~

- New :ref:`cut <reshaping.tile.cut>` function (like R's cut function) for
computing a categorical variable from a continuous variable by binning values
- New :ref:`cut <reshaping.tile.cut>` and ``qcut`` functions (like R's cut
function) for computing a categorical variable from a continuous variable by
binning values either into value-based (``cut``) or quantile-based (``qcut``)
bins
- Rename ``Factor`` to ``Categorical`` and add a number of usability features
- Add :ref:`limit <missing_data.fillna.limit>` argument to fillna/reindex
- More flexible multiple function application in GroupBy, and can pass list
of (name, function) tuples to get result in particular order with given names
@@ -133,8 +137,8 @@ Other new features
memory usage than Python's dict
- Add first, last, min, max, and prod optimized GroupBy functions
- New :ref:`ordered_merge <merging.ordered_merge>` function
- Add flexible :ref:`comparison <basics.binop>` instance methods eq, ne, lt, gt, etc. to DataFrame,
Series
- Add flexible :ref:`comparison <basics.binop>` instance methods eq, ne, lt,
gt, etc. to DataFrame, Series
- Improve :ref:`scatter_matrix <visualization.scatter_matrix>` plotting
function and add histogram or kernel density estimates to diagonal
- Add :ref:`'kde' <visualization.kde>` plot option for density plots
@@ -146,6 +150,42 @@ Other new features
- Can select multiple columns from GroupBy
- Add :ref:`update <merging.combine_first.update>` methods to Series/DataFrame
for updating values in place
- Add ``any`` and ``all`` methods to DataFrame
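
The list above distinguishes value-based bins (``cut``) from quantile-based bins (``qcut``). The plain-Python sketch below illustrates that distinction only; it is not the pandas implementation. The helper names are hypothetical, and the right-closed bin convention is an assumption made for the example:

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # one large outlier


def equal_width_edges(data, bins):
    """Value-based edges: split the range [min, max] into equal widths."""
    lo, hi = min(data), max(data)
    step = (hi - lo) / bins
    return [lo + i * step for i in range(bins + 1)]


def quantile_edges(data, bins):
    """Quantile-based edges: each bin holds about the same number of points."""
    inner = statistics.quantiles(data, n=bins, method="inclusive")
    return [min(data)] + inner + [max(data)]


def bin_counts(data, edges):
    """Count points per bin, treating bins as right-closed (an assumption)."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if x <= edges[i + 1]:
                counts[i] += 1
                break
    return counts


width_counts = bin_counts(values, equal_width_edges(values, 2))
quantile_counts = bin_counts(values, quantile_edges(values, 2))
print(width_counts, quantile_counts)
```

With an outlier in the data, equal-width bins leave almost everything in the first bin, while quantile bins split the points evenly.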

New plotting methods
~~~~~~~~~~~~~~~~~~~~

.. ipython:: python
   :suppress:

   import pandas as pd
   fx = pd.load('data/fx_prices')
   import matplotlib.pyplot as plt

``Series.plot`` now supports a ``secondary_y`` option:

.. ipython:: python

   plt.figure()

   fx['FR'].plot(style='g')

   @savefig whatsnew_secondary_y.png width=4.5in
   fx['IT'].plot(style='k--', secondary_y=True)

Vytautas Jancauskas, the 2012 GSOC participant, has added many new plot
types. For example, ``'kde'`` is a new option:

.. ipython:: python

   s = Series(np.concatenate((np.random.randn(1000),
                              np.random.randn(1000) * 0.5 + 3)))
   plt.figure()
   s.hist(normed=True, alpha=0.2)
   @savefig whatsnew_kde.png width=4.5in
   s.plot(kind='kde')

See :ref:`the plotting page <visualization.other>` for much more.

Other API changes
~~~~~~~~~~~~~~~~~
16 changes: 16 additions & 0 deletions doc/source/visualization.rst
@@ -91,6 +91,20 @@ You may pass ``logy`` to get a log-scale Y axis.
   @savefig series_plot_logy.png width=4.5in
   ts.plot(logy=True)

Plotting on a Secondary Y-axis
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To plot data on a secondary y-axis, use the ``secondary_y`` keyword:

.. ipython:: python

   plt.figure()

   df.A.plot()

   @savefig series_plot_secondary_y.png width=4.5in
   df.B.plot(secondary_y=True, style='g')

Targeting different subplots
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -107,6 +121,8 @@ You can pass an ``ax`` argument to ``Series.plot`` to plot on a particular axis:
   @savefig series_plot_multi.png width=4.5in
   df['D'].plot(ax=axes[1,1]); axes[1,1].set_title('D')

.. _visualization.other:

Other plotting features
-----------------------

18 changes: 9 additions & 9 deletions doc/source/whatsnew.rst
@@ -16,21 +16,21 @@ What's New

These are new features and improvements of note in each release.

.. include:: whatsnew/v0.8.0.txt
.. include:: v0.8.0.txt

.. include:: whatsnew/v0.7.3.txt
.. include:: v0.7.3.txt

.. include:: whatsnew/v0.7.2.txt
.. include:: v0.7.2.txt

.. include:: whatsnew/v0.7.1.txt
.. include:: v0.7.1.txt

.. include:: whatsnew/v0.7.0.txt
.. include:: v0.7.0.txt

.. include:: whatsnew/v0.6.1.txt
.. include:: v0.6.1.txt

.. include:: whatsnew/v0.6.0.txt
.. include:: v0.6.0.txt

.. include:: whatsnew/v0.5.0.txt
.. include:: v0.5.0.txt

.. include:: whatsnew/v0.4.x.txt
.. include:: v0.4.x.txt

3 changes: 2 additions & 1 deletion pandas/core/api.py
@@ -15,7 +15,8 @@
from pandas.core.frame import DataFrame
from pandas.core.panel import Panel
from pandas.core.groupby import groupby
from pandas.core.reshape import pivot_simple as pivot, get_dummies
from pandas.core.reshape import (pivot_simple as pivot, get_dummies,
                                 lreshape)

WidePanel = Panel

12 changes: 12 additions & 0 deletions pandas/core/common.py
@@ -914,3 +914,15 @@ def writerow(self, row):
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)


_NS_DTYPE = np.dtype('M8[ns]')

def _concat_compat(to_concat):
    if all(x.dtype == _NS_DTYPE for x in to_concat):
        # work around NumPy 1.6 bug
        new_values = np.concatenate([x.view(np.int64) for x in to_concat])
        return new_values.view(_NS_DTYPE)
    else:
        return np.concatenate(to_concat)

14 changes: 1 addition & 13 deletions pandas/core/format.py
@@ -594,19 +594,7 @@ def _format_datetime64(x, tz=None):
        return 'NaT'

    stamp = lib.Timestamp(x, tz=tz)
    base = stamp.strftime('%Y-%m-%d %H:%M:%S')

    fraction = stamp.microsecond * 1000 + stamp.nanosecond
    digits = 9

    if fraction == 0:
        return base

    while (fraction % 10) == 0:
        fraction /= 10
        digits -= 1

    return base + ('.%%.%id' % digits) % fraction
    return stamp._repr_base


def _make_fixed_width(strings, justify='right'):
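
The lines removed from ``_format_datetime64`` above built the fractional-second suffix by hand before the logic was replaced by ``stamp._repr_base``. As a reading aid, here is a standalone Python 3 restatement of that trimming step. The function name and signature are hypothetical, and whether ``_repr_base`` behaves identically is not shown in this diff:

```python
def format_with_fraction(base, microsecond, nanosecond):
    """Append the sub-second part to `base`, dropping trailing zeros."""
    fraction = microsecond * 1000 + nanosecond  # total nanoseconds
    digits = 9  # a nanosecond fraction has at most nine digits

    if fraction == 0:
        return base  # whole second: no suffix at all

    while fraction % 10 == 0:
        fraction //= 10  # integer division keeps `fraction` an int on Python 3
        digits -= 1

    # Zero-pad on the left so that 5 microseconds renders as ".000005".
    return base + ".%0*d" % (digits, fraction)


stamp_text = format_with_fraction("2012-06-26 00:00:00", 500000, 0)
print(stamp_text)
```

Note that the removed code used ``fraction /= 10``, which is integer division on Python 2 only; the sketch uses ``//=`` so that it stays integral on Python 3.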