Commit 568a1fc

Merge pull request #273 from zonca/fix-268-no-modify-input-table
Do not modify input astropy table when converting to pandas DataFrame (fixes #268)
2 parents 43955ba + 6e3f826 commit 568a1fc

2 files changed

Lines changed: 72 additions & 73 deletions

episodes/03-transform.md

Lines changed: 65 additions & 66 deletions
Original file line number | Diff line number | Diff line change
@@ -501,66 +501,9 @@ types can be awkward. It will be more convenient to choose one object and get al
501501
data into it.
502502

503503
Now we can extract the columns we want from `skycoord_gd1` and add
504-
them as columns in the Astropy `Table` `polygon_results`. `phi1` and `phi2` contain the
504+
them to a Pandas `DataFrame`. `phi1` and `phi2` contain the
505505
transformed coordinates.
506506

507-
```python
508-
polygon_results['phi1'] = skycoord_gd1.phi1
509-
polygon_results['phi2'] = skycoord_gd1.phi2
510-
polygon_results.info()
511-
```
512-
513-
```output
514-
<Table length=140339>
515-
name dtype unit description class
516-
--------- ------- -------- ------------------------------------------------------------------ ------------
517-
source_id int64 Unique source identifier (unique within a particular Data Release) MaskedColumn
518-
ra float64 deg Right ascension MaskedColumn
519-
dec float64 deg Declination MaskedColumn
520-
pmra float64 mas / yr Proper motion in right ascension direction MaskedColumn
521-
pmdec float64 mas / yr Proper motion in declination direction MaskedColumn
522-
parallax float64 mas Parallax MaskedColumn
523-
phi1 float64 deg Column
524-
phi2 float64 deg Column
525-
```
526-
527-
`pm_phi1_cosphi2` and `pm_phi2` contain the components of proper
528-
motion in the transformed frame.
529-
530-
```python
531-
polygon_results['pm_phi1'] = skycoord_gd1.pm_phi1_cosphi2
532-
polygon_results['pm_phi2'] = skycoord_gd1.pm_phi2
533-
polygon_results.info()
534-
```
535-
536-
```output
537-
<Table length=140339>
538-
name dtype unit description class
539-
--------- ------- -------- ------------------------------------------------------------------ ------------
540-
source_id int64 Unique source identifier (unique within a particular Data Release) MaskedColumn
541-
ra float64 deg Right ascension MaskedColumn
542-
dec float64 deg Declination MaskedColumn
543-
pmra float64 mas / yr Proper motion in right ascension direction MaskedColumn
544-
pmdec float64 mas / yr Proper motion in declination direction MaskedColumn
545-
parallax float64 mas Parallax MaskedColumn
546-
phi1 float64 deg Column
547-
phi2 float64 deg Column
548-
pm_phi1 float64 mas / yr Column
549-
pm_phi2 float64 mas / yr Column
550-
```
551-
552-
::::::::::::::::::::::::::::::::::::::::: callout
553-
554-
Detail
555-
If you notice that `SkyCoord` has an attribute called
556-
`proper_motion`, you might wonder why we are not using it.
557-
558-
We could have: `proper_motion` contains the same data as
559-
`pm_phi1_cosphi2` and `pm_phi2`, but in a different format.
560-
561-
562-
::::::::::::::::::::::::::::::::::::::::::::::::::
563-
564507
::::::::::::::::::::::::::::::::::::::::: callout
565508

566509
## Pandas `DataFrame`s versus Astropy `Table`s
@@ -604,7 +547,7 @@ results_df.shape
604547
```
605548

606549
```output
607-
(140339, 10)
550+
(140339, 6)
608551
```
609552

610553
It also provides `head`, which displays the first few rows. `head` is
@@ -614,6 +557,38 @@ useful for spot-checking large results as you go along.
614557
results_df.head()
615558
```
616559

560+
```output
561+
source_id ra dec pmra pmdec parallax
562+
0 637987125186749568 142.483019 21.757716 -2.516838 2.941813 -0.257345
563+
1 638285195917112960 142.254529 22.476168 2.662702 -12.165984 0.422728
564+
2 638073505568978688 142.645286 22.166932 18.306747 -7.950660 0.103640
565+
3 638086386175786752 142.577394 22.227920 0.987786 -2.584105 -0.857327
566+
4 638049655615392384 142.589136 22.110783 0.244439 -4.941079 0.099625
567+
568+
```
569+
570+
Now we can add the GD-1 coordinates and proper motions as columns in the
571+
`DataFrame`. We use the `.value` attribute to extract the numerical values
572+
without units, since Pandas `DataFrame`s do not preserve Astropy units.
573+
574+
```python
575+
results_df['phi1'] = skycoord_gd1.phi1.value
576+
results_df['phi2'] = skycoord_gd1.phi2.value
577+
results_df['pm_phi1'] = skycoord_gd1.pm_phi1_cosphi2.value
578+
results_df['pm_phi2'] = skycoord_gd1.pm_phi2.value
579+
results_df.shape
580+
```
581+
582+
```output
583+
(140339, 10)
584+
```
585+
586+
And we can check the result with `head`:
587+
588+
```python
589+
results_df.head()
590+
```
591+
617592
```output
618593
source_id ra dec pmra pmdec parallax phi1 phi2 pm_phi1 pm_phi2
619594
0 637987125186749568 142.483019 21.757716 -2.516838 2.941813 -0.257345 -54.975623 -3.659349 6.429945 6.518157
@@ -626,6 +601,30 @@ results_df.head()
626601

627602
::::::::::::::::::::::::::::::::::::::::: callout
628603

604+
## Why `.value`?
605+
606+
The attributes of a `SkyCoord` object, like `phi1` and `phi2`,
607+
are `Quantity` objects that carry units (for example, degrees).
608+
Pandas `DataFrame`s do not support `Quantity` columns, so we use
609+
the `.value` attribute to extract the numerical values without units.
610+
611+
612+
::::::::::::::::::::::::::::::::::::::::::::::::::
613+
614+
::::::::::::::::::::::::::::::::::::::::: callout
615+
616+
Detail
617+
If you notice that `SkyCoord` has an attribute called
618+
`proper_motion`, you might wonder why we are not using it.
619+
620+
We could have: `proper_motion` contains the same data as
621+
`pm_phi1_cosphi2` and `pm_phi2`, but in a different format.
622+
623+
624+
::::::::::::::::::::::::::::::::::::::::::::::::::
625+
626+
::::::::::::::::::::::::::::::::::::::::: callout
627+
629628
## Attributes vs functions
630629

631630
`shape` is an attribute, so we display its value
@@ -649,7 +648,7 @@ to copy and paste the code over and over again.
649648

650649
```python
651650
def make_dataframe(table):
652-
"""Transform coordinates from ICRS to GD-1 frame.
651+
"""Transform an Astropy table with coordinates in ICRS and convert it to a Pandas DataFrame with GD-1 coordinates.
653652
654653
table: Astropy Table
655654
@@ -674,15 +673,15 @@ def make_dataframe(table):
674673
# Correct GD-1 coordinates for solar system motion around galactic center
675674
skycoord_gd1 = reflex_correct(transformed)
676675

677-
#Add GD-1 reference frame columns for coordinates and proper motions
678-
table['phi1'] = skycoord_gd1.phi1
679-
table['phi2'] = skycoord_gd1.phi2
680-
table['pm_phi1'] = skycoord_gd1.pm_phi1_cosphi2
681-
table['pm_phi2'] = skycoord_gd1.pm_phi2
682-
683676
# Create DataFrame
684677
df = table.to_pandas()
685678

679+
# Add GD-1 reference frame columns for coordinates and proper motions
680+
df['phi1'] = skycoord_gd1.phi1.value
681+
df['phi2'] = skycoord_gd1.phi2.value
682+
df['pm_phi1'] = skycoord_gd1.pm_phi1_cosphi2.value
683+
df['pm_phi2'] = skycoord_gd1.pm_phi2.value
684+
686685
return df
687686
```
688687

student_download/episode_functions.py

Lines changed: 7 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -29,7 +29,7 @@ def skycoord_to_string(skycoord):
2929
# Episode 3
3030
##########################
3131
def make_dataframe(table):
32-
"""Transform coordinates from ICRS to GD-1 frame.
32+
"""Transform an Astropy table with coordinates in ICRS and convert it to a Pandas DataFrame with GD-1 coordinates.
3333
3434
table: Astropy Table
3535
@@ -54,15 +54,15 @@ def make_dataframe(table):
5454
# Correct GD-1 coordinates for solar system motion around galactic center
5555
skycoord_gd1 = reflex_correct(transformed)
5656

57-
#Add GD-1 reference frame columns for coordinates and proper motions
58-
table['phi1'] = skycoord_gd1.phi1
59-
table['phi2'] = skycoord_gd1.phi2
60-
table['pm_phi1'] = skycoord_gd1.pm_phi1_cosphi2
61-
table['pm_phi2'] = skycoord_gd1.pm_phi2
62-
6357
# Create DataFrame
6458
df = table.to_pandas()
6559

60+
# Add GD-1 reference frame columns for coordinates and proper motions
61+
df['phi1'] = skycoord_gd1.phi1.value
62+
df['phi2'] = skycoord_gd1.phi2.value
63+
df['pm_phi1'] = skycoord_gd1.pm_phi1_cosphi2.value
64+
df['pm_phi2'] = skycoord_gd1.pm_phi2.value
65+
6666
return df
6767

6868
##########################
