DOC: Grammar improvements in getting started tutorials (#58706)
Aloqeely committed May 13, 2024
1 parent 4829b36 commit b195361
Showing 9 changed files with 30 additions and 32 deletions.
@@ -192,8 +192,8 @@ Check more options on ``describe`` in the user guide section about :ref:`aggrega
.. note::
This is just a starting point. Similar to spreadsheet
software, pandas represents data as a table with columns and rows. Apart
from the representation, also the data manipulations and calculations
you would do in spreadsheet software are supported by pandas. Continue
from the representation, the data manipulations and calculations
you would do in spreadsheet software are also supported by pandas. Continue
reading the next tutorials to get started!

.. raw:: html
@@ -204,7 +204,7 @@ Check more options on ``describe`` in the user guide section about :ref:`aggrega
- Import the package, aka ``import pandas as pd``
- A table of data is stored as a pandas ``DataFrame``
- Each column in a ``DataFrame`` is a ``Series``
- You can do things by applying a method to a ``DataFrame`` or ``Series``
- You can do things by applying a method on a ``DataFrame`` or ``Series``

.. raw:: html

@@ -215,7 +215,7 @@ Check more options on ``describe`` in the user guide section about :ref:`aggrega
<div class="d-flex flex-row gs-torefguide">
<span class="badge badge-info">To user guide</span>

A more extended explanation to ``DataFrame`` and ``Series`` is provided in the :ref:`introduction to data structures <dsintro>`.
A more extended explanation of ``DataFrame`` and ``Series`` is provided in the :ref:`introduction to data structures <dsintro>` page.

.. raw:: html

10 changes: 5 additions & 5 deletions doc/source/getting_started/intro_tutorials/02_read_write.rst
@@ -172,11 +172,11 @@ The method :meth:`~DataFrame.info` provides technical information about a
- The table has 12 columns. Most columns have a value for each of the
rows (all 891 values are ``non-null``). Some columns do have missing
values and less than 891 ``non-null`` values.
- The columns ``Name``, ``Sex``, ``Cabin`` and ``Embarked`` consists of
- The columns ``Name``, ``Sex``, ``Cabin`` and ``Embarked`` consist of
textual data (strings, aka ``object``). The other columns are
numerical data with some of them whole numbers (aka ``integer``) and
others are real numbers (aka ``float``).
- The kind of data (characters, integers,…) in the different columns
numerical data, some of them are whole numbers (``integer``) and
others are real numbers (``float``).
- The kind of data (characters, integers, …) in the different columns
are summarized by listing the ``dtypes``.
- The approximate amount of RAM used to hold the DataFrame is provided
as well.
@@ -194,7 +194,7 @@ The method :meth:`~DataFrame.info` provides technical information about a
- Getting data in to pandas from many different file formats or data
sources is supported by ``read_*`` functions.
- Exporting data out of pandas is provided by different
``to_*``\ methods.
``to_*`` methods.
- The ``head``/``tail``/``info`` methods and the ``dtypes`` attribute
are convenient for a first check.
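
As an illustration of the points above, here is a minimal sketch of a write/read round trip followed by a first check. The small ``passengers`` table and its values are made up for this example (they are not the tutorial's Titanic data), and an in-memory buffer stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical stand-in for the tutorial's table; values are invented.
passengers = pd.DataFrame(
    {
        "Name": ["Allen", "Braund", "Bonnell"],
        "Age": [29.0, 22.0, None],
        "Survived": [1, 0, 1],
    }
)

# A to_* method exports the data, here as CSV text held in memory.
buffer = io.StringIO()
passengers.to_csv(buffer, index=False)

# A read_* function loads the data back into a DataFrame.
buffer.seek(0)
restored = pd.read_csv(buffer)

# Convenient first checks on the result.
print(restored.head(2))
print(restored.dtypes)
restored.info()
```

The missing ``Age`` value survives the round trip as a missing value, which is why ``info`` reports fewer ``non-null`` entries for that column than there are rows.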

10 changes: 4 additions & 6 deletions doc/source/getting_started/intro_tutorials/03_subset_data.rst
@@ -300,7 +300,7 @@ want to select.
</li>
</ul>

When using the column names, row labels or a condition expression, use
When using column names, row labels or a condition expression, use
the ``loc`` operator in front of the selection brackets ``[]``. For both
the part before and after the comma, you can use a single label, a list
of labels, a slice of labels, a conditional expression or a colon. Using
@@ -342,7 +342,7 @@ the name ``anonymous`` to the first 3 elements of the fourth column:
<div class="d-flex flex-row gs-torefguide">
<span class="badge badge-info">To user guide</span>

See the user guide section on :ref:`different choices for indexing <indexing.choice>` to get more insight in the usage of ``loc`` and ``iloc``.
See the user guide section on :ref:`different choices for indexing <indexing.choice>` to get more insight into the usage of ``loc`` and ``iloc``.

.. raw:: html

@@ -357,10 +357,8 @@ See the user guide section on :ref:`different choices for indexing <indexing.cho
- Inside these square brackets, you can use a single column/row label, a list
of column/row labels, a slice of labels, a conditional expression or
a colon.
- Select specific rows and/or columns using ``loc`` when using the row
and column names.
- Select specific rows and/or columns using ``iloc`` when using the
positions in the table.
- Use ``loc`` for label-based selection (using row/column names).
- Use ``iloc`` for position-based selection (using table positions).
- You can assign new values to a selection based on ``loc``/``iloc``.
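
The difference between label-based and position-based selection can be sketched as follows. The ``titanic`` table here is a small invented stand-in with assumed column names, not the full dataset used in the tutorial.

```python
import pandas as pd

# Hypothetical miniature table; names and values are invented.
titanic = pd.DataFrame(
    {
        "Name": ["Allen", "Braund", "Bonnell", "Cumings"],
        "Age": [29.0, 22.0, 58.0, 38.0],
        "Pclass": [1, 3, 1, 1],
    }
)

# loc: a condition (or row labels) before the comma, column names after it.
adult_names = titanic.loc[titanic["Age"] > 35, "Name"]

# iloc: integer positions; rows 0-1 and columns 0-1 (the end is exclusive).
first_block = titanic.iloc[0:2, 0:2]

# A selection made with loc/iloc can also be assigned to.
titanic.iloc[0:3, 0] = "anonymous"
print(titanic)
```

Note that ``iloc`` slices exclude the end position, as in standard Python slicing, whereas ``loc`` label slices include both endpoints.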

.. raw:: html
6 changes: 3 additions & 3 deletions doc/source/getting_started/intro_tutorials/04_plotting.rst
@@ -85,7 +85,7 @@ I want to plot only the columns of the data table with the data from Paris.
air_quality["station_paris"].plot()
plt.show()
To plot a specific column, use the selection method of the
To plot a specific column, use a selection method from the
:ref:`subset data tutorial <10min_tut_03_subset>` in combination with the :meth:`~DataFrame.plot`
method. Hence, the :meth:`~DataFrame.plot` method works on both ``Series`` and
``DataFrame``.
@@ -127,7 +127,7 @@ standard Python to get an overview of the available plot methods:
]
.. note::
In many development environments as well as IPython and
In many development environments such as IPython and
Jupyter Notebook, use the TAB button to get an overview of the available
methods, for example ``air_quality.plot.`` + TAB.

@@ -238,7 +238,7 @@ This strategy is applied in the previous example:

- The ``.plot.*`` methods are applicable on both Series and DataFrames.
- By default, each of the columns is plotted as a different element
(line, boxplot,…).
(line, boxplot, …).
- Any plot created by pandas is a Matplotlib object.

.. raw:: html
4 changes: 2 additions & 2 deletions doc/source/getting_started/intro_tutorials/05_add_columns.rst
@@ -89,8 +89,8 @@ values in each row*.
</li>
</ul>

Also other mathematical operators (``+``, ``-``, ``*``, ``/``,…) or
logical operators (``<``, ``>``, ``==``,…) work element-wise. The latter was already
Other mathematical operators (``+``, ``-``, ``*``, ``/``, …) and logical
operators (``<``, ``>``, ``==``, …) also work element-wise. The latter was already
used in the :ref:`subset data tutorial <10min_tut_03_subset>` to filter
rows of a table using a conditional expression.
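
A short sketch of what "element-wise" means in practice, using a made-up ``air_quality`` table (the column names mirror the tutorial, the numbers are invented): the arithmetic operator is applied row by row without an explicit loop, and the comparison yields one boolean per row that can be used as a filter.

```python
import pandas as pd

# Hypothetical measurements; values are invented for illustration.
air_quality = pd.DataFrame(
    {
        "station_london": [23.0, 19.0, 26.0],
        "station_paris": [46.0, 38.0, 52.0],
    }
)

# Arithmetic operators work element-wise: one division per row.
air_quality["ratio_paris_london"] = (
    air_quality["station_paris"] / air_quality["station_london"]
)

# Comparison operators also work element-wise and return booleans.
above_40 = air_quality["station_paris"] > 40

# The boolean Series filters the rows of the table.
high = air_quality[above_40]
print(high)
```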

@@ -235,7 +235,7 @@ category in a column.
</li>
</ul>

The function is a shortcut, as it is actually a groupby operation in combination with counting of the number of records
The function is a shortcut, it is actually a groupby operation in combination with counting the number of records
within each group:

.. ipython:: python
@@ -137,7 +137,7 @@ Hence, the resulting table has 3178 = 1110 + 2068 rows.
Most operations like concatenation or summary statistics are by default
across rows (axis 0), but can be applied across columns as well.

Sorting the table on the datetime information illustrates also the
Sorting the table on the datetime information also illustrates the
combination of both tables, with the ``parameter`` column defining the
origin of the table (either ``no2`` from table ``air_quality_no2`` or
``pm25`` from table ``air_quality_pm25``):
@@ -286,7 +286,7 @@ between the two tables.
<div class="d-flex flex-row gs-torefguide">
<span class="badge badge-info">To user guide</span>

pandas supports also inner, outer, and right joins.
pandas also supports inner, outer, and right joins.
More information on join/merge of tables is provided in the user guide section on
:ref:`database style merging of tables <merging.join>`. Or have a look at the
:ref:`comparison with SQL<compare_with_sql.join>` page.
@@ -300,7 +300,7 @@ More information on join/merge of tables is provided in the user guide section o
<div class="shadow gs-callout gs-callout-remember">
<h4>REMEMBER</h4>

- Multiple tables can be concatenated both column-wise and row-wise using
- Multiple tables can be concatenated column-wise or row-wise using
the ``concat`` function.
- For database-like merging/joining of tables, use the ``merge``
function.
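
The two operations can be contrasted with a small sketch. The tables below are invented miniatures (the ``location`` codes and the ``city`` column are assumptions for this example), far smaller than the tutorial's air-quality data.

```python
import pandas as pd

# Hypothetical measurement tables with identical columns.
no2 = pd.DataFrame(
    {
        "location": ["FR04014", "BETR801"],
        "parameter": ["no2", "no2"],
        "value": [20.0, 26.5],
    }
)
pm25 = pd.DataFrame(
    {
        "location": ["FR04014", "BETR801"],
        "parameter": ["pm25", "pm25"],
        "value": [12.0, 18.0],
    }
)

# concat stacks the tables row-wise (axis=0 is the default).
combined = pd.concat([no2, pm25], axis=0)

# Hypothetical lookup table that only knows one of the locations.
stations = pd.DataFrame({"location": ["FR04014"], "city": ["Paris"]})

# merge performs a database-style join on a shared key column.
left_joined = pd.merge(combined, stations, how="left", on="location")
inner_joined = pd.merge(combined, stations, how="inner", on="location")
print(left_joined)
```

With ``how="left"`` every row of ``combined`` is kept and unmatched locations get a missing ``city``; with ``how="inner"`` only the rows whose ``location`` appears in both tables remain.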
10 changes: 5 additions & 5 deletions doc/source/getting_started/intro_tutorials/09_timeseries.rst
@@ -77,9 +77,9 @@ I want to work with the dates in the column ``datetime`` as datetime objects ins
Initially, the values in ``datetime`` are character strings and do not
provide any datetime operations (e.g. extract the year, day of the
week,…). By applying the ``to_datetime`` function, pandas interprets the
week, …). By applying the ``to_datetime`` function, pandas interprets the
strings and convert these to datetime (i.e. ``datetime64[ns, UTC]``)
objects. In pandas we call these datetime objects similar to
objects. In pandas we call these datetime objects that are similar to
``datetime.datetime`` from the standard library as :class:`pandas.Timestamp`.

.. raw:: html
@@ -117,7 +117,7 @@ length of our time series:
air_quality["datetime"].max() - air_quality["datetime"].min()
The result is a :class:`pandas.Timedelta` object, similar to ``datetime.timedelta``
from the standard Python library and defining a time duration.
from the standard Python library which defines a time duration.
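
A minimal sketch of the conversion and the resulting duration, assuming a tiny invented ``air_quality`` table whose ``datetime`` column starts out as strings with a UTC offset:

```python
import pandas as pd

# Hypothetical table; the timestamps and values are invented.
air_quality = pd.DataFrame(
    {
        "datetime": ["2019-05-07 01:00:00+00:00", "2019-06-21 00:00:00+00:00"],
        "value": [25.0, 20.0],
    }
)

# Strings offer no datetime operations; convert them first.
air_quality["datetime"] = pd.to_datetime(air_quality["datetime"])

# Individual values are now pandas.Timestamp objects.
first = air_quality["datetime"].min()

# Subtracting two Timestamps gives a pandas.Timedelta.
span = air_quality["datetime"].max() - first
print(span)
```

After the conversion, datetime properties such as ``first.year`` become available, which the original string values do not provide.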

.. raw:: html

@@ -257,7 +257,7 @@ the adapted time scale on plots. Let’s apply this on our data.
<ul class="task-bullet">
<li>

Create a plot of the :math:`NO_2` values in the different stations from the 20th of May till the end of 21st of May
Create a plot of the :math:`NO_2` values in the different stations from May 20th till the end of May 21st.

.. ipython:: python
:okwarning:
@@ -310,7 +310,7 @@ converting secondly data into 5-minutely data).
The :meth:`~Series.resample` method is similar to a groupby operation:

- it provides a time-based grouping, by using a string (e.g. ``M``,
``5H``,…) that defines the target frequency
``5H``, …) that defines the target frequency
- it requires an aggregation function such as ``mean``, ``max``,…
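
The two ingredients listed above (a frequency string and an aggregation function) can be sketched on a made-up series. The daily frequency ``"D"`` is used here as an assumption for the example, rather than the ``M`` or ``5H`` strings mentioned in the text, and the values are invented.

```python
import pandas as pd

# Hypothetical 12-hourly measurements spanning two days.
index = pd.to_datetime(
    [
        "2019-05-20 00:00",
        "2019-05-20 12:00",
        "2019-05-21 00:00",
        "2019-05-21 12:00",
    ]
)
no2 = pd.Series([20.0, 30.0, 10.0, 40.0], index=index)

# The string defines the target frequency (time-based grouping) ...
daily = no2.resample("D")

# ... and an aggregation function reduces each group to one value.
daily_max = daily.max()
daily_mean = daily.mean()
print(daily_max)
print(daily_mean)
```

As with ``groupby``, nothing is computed until the aggregation is called on the resampler object.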

.. raw:: html
6 changes: 3 additions & 3 deletions doc/source/getting_started/intro_tutorials/10_text_data.rst
@@ -134,8 +134,8 @@ only one countess on the Titanic, we get one row as a result.
.. note::
More powerful extractions on strings are supported, as the
:meth:`Series.str.contains` and :meth:`Series.str.extract` methods accept `regular
expressions <https://docs.python.org/3/library/re.html>`__, but out of
scope of this tutorial.
expressions <https://docs.python.org/3/library/re.html>`__, but are out of
the scope of this tutorial.

.. raw:: html

@@ -200,7 +200,7 @@ In the "Sex" column, replace values of "male" by "M" and values of "female" by "
Whereas :meth:`~Series.replace` is not a string method, it provides a convenient way
to use mappings or vocabularies to translate certain values. It requires
a ``dictionary`` to define the mapping ``{from : to}``.
a ``dictionary`` to define the mapping ``{from: to}``.
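
A short sketch of the dictionary-based mapping, using an invented ``Sex`` column rather than the full Titanic table:

```python
import pandas as pd

# Hypothetical column; values are invented for illustration.
sex = pd.Series(["male", "female", "female", "male"], name="Sex")

# The dictionary maps each original value to its replacement.
sex_short = sex.replace({"male": "M", "female": "F"})
print(sex_short)
```

Used this way, ``replace`` matches whole values rather than substrings, so ``"female"`` is not affected by the ``"male"`` entry.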

.. raw:: html
