Delete docs/developer_center directory #1683

Merged
64 commits
c055457
Doc: fixed credentials link for set_default_credentials
Jul 20, 2020
a6de8ef
Removed the batch_size parameter in the call to cdb_bulk_geocode_stre…
antoniocarlon Jul 21, 2020
a5151cb
Merge pull request #1666 from CartoDB/feature/ch91890/sociallydetermi…
antoniocarlon Jul 21, 2020
3db4696
Allow to set a value for null geometries
antoniocarlon Jul 23, 2020
8be7fb1
Add documentation for executing a single test
antoniocarlon Jul 23, 2020
c5cb37d
Modified changelog
antoniocarlon Jul 23, 2020
5bc6f1f
Merge pull request #1668 from CartoDB/antoniocarlon/null/documentatio…
antoniocarlon Jul 23, 2020
db4ae1e
Merge pull request #1667 from CartoDB/bug/ch94476/axa-group-error-per…
antoniocarlon Jul 23, 2020
a79e77f
Added global message for catalog entities (except countries) when ask…
Jul 30, 2020
13f51e2
Fixed global country message typo
Jul 30, 2020
d3ea77c
Merge pull request #1665 from CartoDB/josema/ch93700/fix-links-in-ref…
Jesus89 Aug 14, 2020
dc45ed2
Merge pull request #1670 from CartoDB/josema/ch90840/visibility-of-gl…
Jesus89 Aug 14, 2020
5f7e56a
Switch from Travis to Github Actions
antoniocarlon Aug 17, 2020
b10fe84
Small fixes
antoniocarlon Aug 17, 2020
ea6d74b
GH actions + tox
antoniocarlon Aug 17, 2020
340afd7
Add dependency
antoniocarlon Aug 17, 2020
44c58e1
Change name
antoniocarlon Aug 17, 2020
e0a8fdf
Removing sort by data_range when retrieving isolines
antoniocarlon Aug 19, 2020
a899d72
Check account disk quotas before writing using to_carto
antoniocarlon Aug 19, 2020
084b6eb
Fixing tests
antoniocarlon Aug 19, 2020
fd60f3b
Added ignore_quota_warning parameter
antoniocarlon Aug 20, 2020
4c1f046
Merge pull request #1673 from CartoDB/bug/ch70796/sort-source-id-when…
Jesus89 Aug 20, 2020
fbcec7e
Merge pull request #1672 from CartoDB/chore/ch88281/github-actions-ci…
Jesus89 Aug 20, 2020
0202547
Merge branch 'develop' into bug/ch93719/check-account-disk-quotas-bef…
antoniocarlon Aug 21, 2020
bc6e82c
add identifier quoting for columns in Context Manager
arredond Aug 21, 2020
ca807d8
proper syntax now
arredond Aug 21, 2020
f47e85e
fix tests
arredond Aug 21, 2020
f808620
Upload table using to_carto in chunks
antoniocarlon Aug 21, 2020
60eb87d
fix context manager test
arredond Aug 21, 2020
34d76ce
Fix source test
Jesus89 Aug 21, 2020
ad0bcbf
Added retry copy decorator
antoniocarlon Aug 24, 2020
9bfc22c
Added tests
antoniocarlon Aug 24, 2020
6dcd268
Tests chunks
antoniocarlon Aug 24, 2020
33ebf64
Fix test
antoniocarlon Aug 24, 2020
9aee2ce
Fix test
antoniocarlon Aug 24, 2020
8214e62
Fix test
antoniocarlon Aug 24, 2020
6beccb5
Fix tests
antoniocarlon Aug 24, 2020
7deed89
Added retry_times to the to_carto function docstring
antoniocarlon Aug 24, 2020
8880794
Precalculate the columns
antoniocarlon Aug 24, 2020
bbbf0e1
Improved size calculations
antoniocarlon Aug 24, 2020
a23fa56
Removed cast
antoniocarlon Aug 24, 2020
dea97b0
Added note about the chunks in the tests
antoniocarlon Aug 24, 2020
728db60
Extract the estimate_csv_size from to_carto
antoniocarlon Aug 24, 2020
de6b944
Get the size using the length of the CSV of a sample and using a conv…
antoniocarlon Aug 24, 2020
40ca547
Merge pull request #1676 from CartoDB/chore/ch93723/optimize-upload-t…
Jesus89 Aug 24, 2020
3cb1ddb
Fix merge develop
antoniocarlon Aug 24, 2020
12fa4d5
Change parameter name. Improved doc, Improved code legibility
antoniocarlon Aug 24, 2020
214a79c
Test skip_quota_warning
antoniocarlon Aug 24, 2020
4c8a751
Added test
antoniocarlon Aug 24, 2020
48f0d58
Refactor double quote format
Jesus89 Aug 24, 2020
a950bf9
Fix uploading extra the_geom fails
antoniocarlon Aug 24, 2020
db6f7e1
Add "copy to" test
Jesus89 Aug 24, 2020
fd7928a
Move managers tests to io
Jesus89 Aug 24, 2020
3323646
Merge branch 'develop' into arredond/identifier-quoting-for-columns
Jesus89 Aug 24, 2020
fc6e1eb
Merge pull request #1675 from CartoDB/arredond/identifier-quoting-for…
Jesus89 Aug 24, 2020
c95588c
Merge pull request #1674 from CartoDB/bug/ch93719/check-account-disk-…
Jesus89 Aug 25, 2020
a3398cc
Fix SQL queries content format
Jesus89 Aug 25, 2020
65080e9
Add test
Jesus89 Aug 25, 2020
4781894
Speed up unit tests (avoid external calls)
Jesus89 Aug 25, 2020
fbed888
Merge pull request #1678 from CartoDB/fix/wrong-sql-queries
Jesus89 Aug 25, 2020
3e287eb
Fixed error when geom_col is not None
antoniocarlon Aug 25, 2020
09e9434
Merge pull request #1677 from CartoDB/bug/ch93715/uploading-extra-the…
Jesus89 Aug 25, 2020
4dcc7ab
Deleted developer center docs
antoniocarlon Aug 27, 2020
c2d6739
Merge branch 'develop' into antonio/ch93707/repo-clean-up-scripts-doc…
antoniocarlon Aug 27, 2020
36 changes: 36 additions & 0 deletions .github/workflows/cartoframes-ci.yml
@@ -0,0 +1,36 @@
name: Run CARTOFrames tests

on:
push:
pull_request:
branches:
- master
- develop

jobs:
test:
strategy:
matrix:
python-version: [3.5, 3.6, 3.7, 3.8]

name: Run tests on Python ${{ matrix.python-version }}

runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2

- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v1
with:
python-version: ${{ matrix.python-version }}

- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install tox
pip install tox-gh-actions

- name: Test with tox
run: |
tox
12 changes: 0 additions & 12 deletions .travis.yml

This file was deleted.

11 changes: 8 additions & 3 deletions CHANGELOG.md
@@ -5,19 +5,24 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Pending]

### Changed
- Allow setting a value for null geometries in the `read_carto` and `Geocoding.geocode` methods (#1667)

## [1.0.4] - 2020-07-06

## Added
### Added
- Add list_tables function (#1649)
- Add catalog public filter to providers, countries and categories (#1658)
- Add set_default_do_credentials function for DO authentication (#1655)

## Changed
### Changed
- Open publication link in another window (#1647)
- Show a warning when uploading a GeoDataFrame without geometry (#1650)
- Improve GeoDataFrame CRS check, docs and examples (#1656)

## Fixed
### Fixed
- Fix empty geometries issue (#1652)
- Fix Layout publication API key issue (#1654)
- Fix ColumnInfo comparison when replacing a table (#1660)
14 changes: 12 additions & 2 deletions cartoframes/auth/credentials.py
@@ -109,15 +109,25 @@ def session(self, session):
"""Set session"""
self._session = session

@property
def me_data(self):
me_data = {}

try:
me_data = self.get_api_key_auth_client().send(ME_SERVICE, 'get').json()
except Exception:
pass

return me_data

@property
def user_id(self):
"""Credentials user ID"""
if not self._user_id:
log.debug('Getting `user_id` for {}'.format(self._username))
api_key_auth_client = self.get_api_key_auth_client()

try:
user_me = api_key_auth_client.send(ME_SERVICE, 'get').json()
user_me = self.me_data
user_data = user_me.get('user_data')
if user_data:
self._user_id = user_data.get('id')
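
The `me_data` property in this hunk swallows any request error and falls back to an empty dict, so callers such as `user_id` can chain `.get(...)` without extra checks. A minimal sketch of that pattern is given here; `SketchCredentials` and the `fetch_me` callable are hypothetical stand-ins for the real `Credentials` class and its `ME_SERVICE` request, not cartoframes APIs. Since `me_data` is declared as a property, it is read as an attribute; calling the returned dict would raise a `TypeError`.

```python
class SketchCredentials:
    """Hypothetical stand-in for the Credentials class touched by this diff."""

    def __init__(self, fetch_me):
        # fetch_me: assumed callable returning the parsed "me" payload (a dict)
        self._fetch_me = fetch_me
        self._user_id = None

    @property
    def me_data(self):
        try:
            return self._fetch_me()
        except Exception:
            # Same fallback as the diff: hide the error, return an empty dict
            return {}

    @property
    def user_id(self):
        if not self._user_id:
            # Property access, no parentheses
            user_data = self.me_data.get('user_data')
            if user_data:
                self._user_id = user_data.get('id')
        return self._user_id


def failing_fetch():
    """Simulates an unreachable service."""
    raise ConnectionError('service unreachable')


ok = SketchCredentials(lambda: {'user_data': {'id': 'abc123'}})
broken = SketchCredentials(failing_fetch)
```

One consequence of the silent fallback is that a network failure and an account without `user_data` look the same to the caller; both yield no user ID.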
2 changes: 1 addition & 1 deletion cartoframes/auth/defaults.py
@@ -13,7 +13,7 @@ def set_default_credentials(

Args:
credentials (:py:class:`Credentials <cartoframes.credentials.Credentials>`, optional):
A :py:class:`Credentials <cartoframes.credentials.Credentials>`
A :py:class:`Credentials <cartoframes.auth.Credentials>`
instance can be used in place of a `username | base_url`/`api_key` combination.
base_url (str, optional): Base URL of CARTO user account. Cloud-based accounts
should use the form ``https://{username}.carto.com`` (e.g.,
13 changes: 12 additions & 1 deletion cartoframes/data/observatory/catalog/catalog.py
@@ -5,8 +5,10 @@
from .dataset import Dataset
from .geography import Geography
from .subscriptions import Subscriptions
from .repository.constants import COUNTRY_FILTER, CATEGORY_FILTER, GEOGRAPHY_FILTER, PROVIDER_FILTER, PUBLIC_FILTER
from .repository.constants import (COUNTRY_FILTER, CATEGORY_FILTER, GEOGRAPHY_FILTER, GLOBAL_COUNTRY_FILTER,
PROVIDER_FILTER, PUBLIC_FILTER)

from ....utils.logger import log
from ....utils.utils import get_credentials


@@ -142,6 +144,7 @@ def categories(self):
CatalogError: if there's a problem when connecting to the catalog or no datasets are found.

"""
self._global_message()
return Category.get_all(self.filters)

@property
@@ -155,6 +158,7 @@ def providers(self):
CatalogError: if there's a problem when connecting to the catalog or no datasets are found.

"""
self._global_message()
return Provider.get_all(self.filters)

@property
@@ -168,6 +172,7 @@ def datasets(self):
CatalogError: if there's a problem when connecting to the catalog or no datasets are found.

"""
self._global_message()
return Dataset.get_all(self.filters)

@property
@@ -181,6 +186,7 @@ def geographies(self):
CatalogError: if there's a problem when connecting to the catalog or no datasets are found.

"""
self._global_message()
return Geography.get_all(self.filters)

def country(self, country_id):
@@ -295,3 +301,8 @@ def datasets_filter(self, filter_dataset):

"""
return Dataset.get_datasets_spatial_filtered(filter_dataset)

def _global_message(self):
if self.filters and self.filters.get(COUNTRY_FILTER) != GLOBAL_COUNTRY_FILTER:
log.info('You can find more entities with the Global country filter. To apply that filter run:'
"\n\tCatalog().country('glo')")
@@ -6,3 +6,4 @@
PUBLIC_FILTER = 'public'
VARIABLE_FILTER = 'variable'
VARIABLE_GROUP_FILTER = 'variable_group'
GLOBAL_COUNTRY_FILTER = 'glo'
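
The condition in `_global_message` can be isolated as a small predicate to see when the hint is logged. This is only a sketch: `should_show_global_hint` is a hypothetical helper, and the value `'country_id'` for `COUNTRY_FILTER` is an assumption, since the diff shows only the constant's name.

```python
COUNTRY_FILTER = 'country_id'   # assumed value; only the name appears in the diff
GLOBAL_COUNTRY_FILTER = 'glo'   # value taken from the constants hunk


def should_show_global_hint(filters):
    """Return True when the 'Global country filter' hint would be logged."""
    return bool(filters) and filters.get(COUNTRY_FILTER) != GLOBAL_COUNTRY_FILTER
```

Under this reading, the hint is skipped when no filters are set or when the country is already `'glo'`, and it is shown for any other filter set, including one that has no country filter at all.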
7 changes: 5 additions & 2 deletions cartoframes/data/services/geocoding.py
@@ -57,7 +57,8 @@ def geocode(self, source, street,
city=None, state=None, country=None,
status=geocoding_constants.DEFAULT_STATUS,
table_name=None, if_exists='fail',
dry_run=False, cached=None):
dry_run=False, cached=None,
null_geom_value=None):
"""Geocode method.

Args:
@@ -92,6 +93,8 @@ def geocode(self, source, street,
table. This parameter should be used along with ``table_name``.
dry_run (bool, optional): no actual geocoding will be performed (useful to
check the needed quota)
null_geom_value (Object, optional): value for the `the_geom` column when it's null.
Defaults to None

Returns:
A named-tuple ``(data, metadata)`` containing either a ``data`` geopandas.GeoDataFrame
@@ -184,7 +187,7 @@ def geocode(self, source, street,
if dry_run:
return self.result(data=None, metadata=metadata)

gdf = read_carto(input_table_name, self._credentials)
gdf = read_carto(input_table_name, self._credentials, null_geom_value=null_geom_value)

if self._source_manager.is_dataframe() and CARTO_INDEX_KEY in gdf:
del gdf[CARTO_INDEX_KEY]
3 changes: 1 addition & 2 deletions cartoframes/data/services/isolines.py
@@ -202,8 +202,7 @@ def _iso_areas(self,
# Execute and download the query to generate the isolines
gdf = read_carto(sql, self._credentials)

# Sorting by `data_range` column and recalculating `cartodb_id`
gdf.sort_values(by=[DATA_RANGE_KEY], ascending=ascending, inplace=True)
# Recalculating `cartodb_id`
gdf.reset_index(drop=True, inplace=True)
if CARTO_INDEX_KEY in gdf.columns:
gdf[CARTO_INDEX_KEY] = gdf.index + 1
3 changes: 0 additions & 3 deletions cartoframes/data/services/utils/geocoding_constants.py
@@ -1,6 +1,5 @@
__all__ = [
'HASH_COLUMN',
'BATCH_SIZE',
'DEFAULT_STATUS',
'QUOTA_SERVICE',
'STATUS_FIELDS',
@@ -12,8 +11,6 @@

HASH_COLUMN = 'carto_geocode_hash'

BATCH_SIZE = 200

DEFAULT_STATUS = {'gc_status_rel': 'relevance'}

QUOTA_SERVICE = 'hires_geocoder'
6 changes: 2 additions & 4 deletions cartoframes/data/services/utils/geocoding_utils.py
@@ -144,16 +144,14 @@ def geocode_query(table, schema, street, city, state, country, status):
{street},
{city},
{state},
{country},
{batch_size}
{country}
)
""".format(
query=query,
street=column_name(street),
city=column_name(city),
state=column_name(state),
country=column_name(country),
batch_size=geocoding_constants.BATCH_SIZE
country=column_name(country)
)

status_assignment, status_columns = status_assignment_columns(status)
55 changes: 51 additions & 4 deletions cartoframes/io/carto.py
@@ -1,11 +1,12 @@
"""Functions to interact with the CARTO platform"""
import math

from pandas import DataFrame
from geopandas import GeoDataFrame

from carto.exceptions import CartoException

from .managers.context_manager import ContextManager
from .managers.context_manager import ContextManager, _compute_copy_data, get_dataframe_columns_info
from ..utils.geom_utils import check_crs, has_geometry, set_geometry
from ..utils.logger import log
from ..utils.utils import is_valid_str, is_sql_query
@@ -15,9 +16,14 @@
GEOM_COLUMN_NAME = 'the_geom'
IF_EXISTS_OPTIONS = ['fail', 'replace', 'append']

MAX_UPLOAD_SIZE_BYTES = 2000000000 # 2GB
SAMPLE_ROWS_NUMBER = 100
CSV_TO_CARTO_RATIO = 1.4


@send_metrics('data_downloaded')
def read_carto(source, credentials=None, limit=None, retry_times=3, schema=None, index_col=None, decode_geom=True):
def read_carto(source, credentials=None, limit=None, retry_times=3, schema=None, index_col=None, decode_geom=True,
null_geom_value=None):
"""Read a table or a SQL query from the CARTO account.

Args:
@@ -32,6 +38,8 @@ def read_carto(source, credentials=None, limit=None, retry_times=3, schema=None,
`current_schema()` using the credentials.
index_col (str, optional): name of the column to be loaded as index. It can be used also to set the index name.
decode_geom (bool, optional): convert the "the_geom" column into a valid geometry column.
null_geom_value (Object, optional): value for the `the_geom` column when it's null.
Defaults to None

Returns:
geopandas.GeoDataFrame
@@ -59,12 +67,16 @@ def read_carto(source, credentials=None, limit=None, retry_times=3, schema=None,
# Decode geometry column
set_geometry(gdf, GEOM_COLUMN_NAME, inplace=True)

if null_geom_value is not None:
gdf[GEOM_COLUMN_NAME].fillna(null_geom_value, inplace=True)

return gdf


@send_metrics('data_uploaded')
def to_carto(dataframe, table_name, credentials=None, if_exists='fail', geom_col=None, index=False, index_label=None,
cartodbfy=True, log_enabled=True):
cartodbfy=True, log_enabled=True, retry_times=3, max_upload_size=MAX_UPLOAD_SIZE_BYTES,
skip_quota_warning=False):
"""Upload a DataFrame to CARTO. The geometry's CRS must be WGS 84 (EPSG:4326) so you can use it on CARTO.

Args:
@@ -79,6 +91,11 @@ def to_carto(dataframe, table_name, credentials=None, if_exists='fail', geom_col
uses the name of the index from the dataframe.
cartodbfy (bool, optional): convert the table to CARTO format. Default True. More info
`here <https://carto.com/developers/sql-api/guides/creating-tables/#create-tables>`.
skip_quota_warning (bool, optional): skip the quota exceeded check and force the upload.
(The upload will still fail if the size of the dataset exceeds the remaining DB quota).
Default is False.
retry_times (int, optional):
Number of times to retry the upload in case it fails. Default is 3.

Returns:
string: the table name normalized.
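
The `retry_times` parameter documented in this hunk is passed down to `copy_from`, but the retry decorator itself is not part of the diff shown on this page. As a rough illustration of how such a decorator could behave, here is a sketch; the `retry` name, its arguments, and the choice to treat `times` as the total number of attempts are all assumptions, not the cartoframes implementation.

```python
import functools


def retry(times=3, exceptions=(Exception,)):
    """Hypothetical decorator: call the function up to `times` times in total."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _attempt in range(max(1, times)):
                try:
                    return func(*args, **kwargs)
                except exceptions as error:
                    # Remember the failure and try again
                    last_error = error
            # Every attempt failed: surface the last error to the caller
            raise last_error
        return wrapper
    return decorator


attempts = []


@retry(times=3)
def flaky_upload():
    """Fails twice, then succeeds, to exercise the retry loop."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError('temporary failure')
    return 'uploaded'
```

A production version would probably wait between attempts and retry only on errors known to be transient; both are left out here to keep the sketch short.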
@@ -102,6 +119,19 @@ def to_carto(dataframe, table_name, credentials=None, if_exists='fail', geom_col

context_manager = ContextManager(credentials)

if not skip_quota_warning:
me_data = context_manager.credentials.me_data
if me_data is not None and me_data.get('user_data'):
n = min(SAMPLE_ROWS_NUMBER, len(dataframe))
estimated_byte_size = len(dataframe.sample(n=n).to_csv(header=False)) * len(dataframe) \
/ n / CSV_TO_CARTO_RATIO
remaining_byte_quota = me_data.get('user_data').get('remaining_byte_quota')

if remaining_byte_quota is not None and estimated_byte_size > remaining_byte_quota:
raise CartoException('DB Quota will be exceeded. '
'The remaining quota is {} bytes and the dataset size is {} bytes.'.format(
remaining_byte_quota, estimated_byte_size))

gdf = GeoDataFrame(dataframe, copy=True)

if index:
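
The quota check in this hunk extrapolates the size of the whole table from the CSV length of a sample of at most `SAMPLE_ROWS_NUMBER` rows, then divides by `CSV_TO_CARTO_RATIO`. The same arithmetic can be sketched with the standard library only. In this sketch, plain tuples stand in for DataFrame rows, `estimate_table_bytes` and `check_quota` are hypothetical names, `ValueError` replaces `CartoException`, and the early return for empty input is an addition of the sketch (with zero rows, the expression in the diff would divide by zero).

```python
import csv
import io
import random

SAMPLE_ROWS_NUMBER = 100
CSV_TO_CARTO_RATIO = 1.4


def estimate_table_bytes(rows, sample_size=SAMPLE_ROWS_NUMBER, ratio=CSV_TO_CARTO_RATIO):
    """Extrapolate the stored size from the CSV length of a random sample."""
    if not rows:
        return 0.0  # guard added in this sketch; avoids dividing by n == 0
    n = min(sample_size, len(rows))
    sample = random.sample(rows, n)
    buffer = io.StringIO()
    csv.writer(buffer).writerows(sample)
    sample_chars = len(buffer.getvalue())
    # sample length * (total rows / sampled rows) / CSV-to-table ratio
    return sample_chars * len(rows) / n / ratio


def check_quota(rows, remaining_byte_quota):
    """Raise if the estimate exceeds the remaining quota; return the estimate."""
    estimated = estimate_table_bytes(rows)
    if remaining_byte_quota is not None and estimated > remaining_byte_quota:
        raise ValueError(
            'DB Quota will be exceeded. The remaining quota is {} bytes '
            'and the dataset size is {} bytes.'.format(remaining_byte_quota, estimated))
    return estimated
```

Because the sample is random, the estimate varies between runs whenever the table has more rows than the sample size and the rows differ in length.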
@@ -118,13 +148,23 @@ def to_carto(dataframe, table_name, credentials=None, if_exists='fail', geom_col
gdf.set_geometry(dataframe.geometry.name, inplace=True)

if has_geometry(gdf):
if GEOM_COLUMN_NAME in gdf and dataframe.geometry.name != GEOM_COLUMN_NAME:
gdf.drop(columns=[GEOM_COLUMN_NAME], inplace=True)

# Prepare geometry column for the upload
gdf.rename_geometry(GEOM_COLUMN_NAME, inplace=True)

elif isinstance(dataframe, GeoDataFrame):
log.warning('Geometry column not found in the GeoDataFrame.')

table_name = context_manager.copy_from(gdf, table_name, if_exists, cartodbfy)
chunk_count = math.ceil(estimate_csv_size(gdf) / max_upload_size)
chunk_row_size = int(math.ceil(len(gdf) / chunk_count))
chunked_gdf = [gdf[i:i + chunk_row_size] for i in range(0, gdf.shape[0], chunk_row_size)]

for i, chunk in enumerate(chunked_gdf):
if i > 0:
if_exists = 'append'
table_name = context_manager.copy_from(chunk, table_name, if_exists, cartodbfy, retry_times)

if log_enabled:
log.info('Success! Data uploaded to table "{}" correctly'.format(table_name))
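
The chunked upload in this hunk derives a chunk count from the estimated CSV size, turns it into a rows-per-chunk figure, and switches `if_exists` to `'append'` after the first chunk. A self-contained sketch of that logic follows, with a plain list standing in for the GeoDataFrame. `split_into_chunks` and `upload_plan` are hypothetical helpers, and the `max(1, ...)` guard and the empty-input check are additions of the sketch (a zero size estimate would otherwise give a chunk count of zero).

```python
import math


def split_into_chunks(rows, estimated_size, max_upload_size):
    """Split rows into slices so each slice is roughly below max_upload_size."""
    if not rows:
        return []
    # Guard added in this sketch: always use at least one chunk
    chunk_count = max(1, math.ceil(estimated_size / max_upload_size))
    chunk_row_size = int(math.ceil(len(rows) / chunk_count))
    return [rows[i:i + chunk_row_size] for i in range(0, len(rows), chunk_row_size)]


def upload_plan(chunks, if_exists='fail'):
    """List the (if_exists mode, row count) pair that each upload would use."""
    plan = []
    for i, chunk in enumerate(chunks):
        # Only the first chunk honours the caller's mode; the rest append
        mode = if_exists if i == 0 else 'append'
        plan.append((mode, len(chunk)))
    return plan
```

For example, 10 rows with an estimated size of 2500 and a limit of 1000 give three chunks of 4, 4 and 2 rows. Because the split is by row count, a chunk can still exceed the limit when row sizes are very uneven.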
@@ -358,3 +398,10 @@ def update_privacy_table(table_name, privacy, credentials=None, log_enabled=True

if log_enabled:
log.info('Success! Table "{}" privacy updated correctly'.format(table_name))


def estimate_csv_size(gdf):
n = min(SAMPLE_ROWS_NUMBER, len(gdf))
columns = get_dataframe_columns_info(gdf)
return sum([len(x) for x in
_compute_copy_data(gdf.sample(n=n), columns)]) * len(gdf) / n