
Commit 68fbcb3

Update BigQuery samples

jmdobry committed Oct 26, 2016
1 parent ba151d5 commit 68fbcb3

Showing 20 changed files with 958 additions and 1,904 deletions.
85 changes: 43 additions & 42 deletions bigquery/README.md
@@ -35,22 +35,21 @@ __Usage:__ `node datasets --help`

 ```
 Commands:
-  create <datasetId>  Create a new dataset with the specified ID.
-  delete <datasetId>  Delete the dataset with the specified ID.
-  list                List datasets in the specified project.
-  size <datasetId>    Calculate the size of the specified dataset.
+  create <datasetId>            Creates a new dataset.
+  delete <datasetId>            Deletes a dataset.
+  list [projectId]              Lists all datasets in the specified project or the current project.
+  size <datasetId> [projectId]  Calculates the size of a dataset.

 Options:
-  --projectId, -p  Optionally specify the project ID to use.  [string] [default: "nodejs-docs-samples"]
-  --help           Show help  [boolean]
+  --help  Show help  [boolean]

 Examples:
-  node datasets create my_dataset                          Create a new dataset with the ID "my_dataset".
-  node datasets delete my_dataset                          Delete a dataset identified as "my_dataset".
-  node datasets list                                       List datasets.
-  node datasets list -p bigquery-public-data               List datasets in the "bigquery-public-data" project.
-  node datasets size my_dataset                            Calculate the size of "my_dataset".
-  node datasets size hacker_news -p bigquery-public-data   Calculate the size of "bigquery-public-data:hacker_news".
+  node datasets create my_dataset                       Creates a new dataset named "my_dataset".
+  node datasets delete my_dataset                       Deletes a dataset named "my_dataset".
+  node datasets list                                    Lists all datasets in the current project.
+  node datasets list bigquery-public-data               Lists all datasets in the "bigquery-public-data" project.
+  node datasets size my_dataset                         Calculates the size of "my_dataset" in the current project.
+  node datasets size hacker_news bigquery-public-data   Calculates the size of "bigquery-public-data:hacker_news".

 For more information, see https://cloud.google.com/bigquery/docs
 ```
@@ -68,17 +67,17 @@ __Usage:__ `node queries --help`

 Commands:
   sync <sqlQuery>   Run the specified synchronous query.
   async <sqlQuery>  Start the specified asynchronous query.
-  wait <jobId>      Wait for the specified job to complete and retrieve its results.
+  shakespeare       Queries a public Shakespeare dataset.

 Options:
   --help  Show help  [boolean]

 Examples:
-  node queries sync "SELECT * FROM
-  `publicdata.samples.natality` LIMIT 5;"
-  node queries async "SELECT * FROM
-  `publicdata.samples.natality` LIMIT 5;"
-  node queries wait job_VwckYXnR8yz54GBDMykIGnrc2
+  node queries sync "SELECT * FROM publicdata.samples.natality   Synchronously queries the natality dataset.
+  LIMIT 5;"
+  node queries async "SELECT * FROM                              Queries the natality dataset as a job.
+  publicdata.samples.natality LIMIT 5;"
+  node queries shakespeare                                       Queries a public Shakespeare dataset.

 For more information, see https://cloud.google.com/bigquery/docs
 ```
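The difference between the two query commands is worth spelling out: `sync` issues the query and waits inline for rows, while `async` creates a query job and fetches its results separately. A rough sketch of the distinction, assuming the promise-style client API this commit migrates the samples to (queries.js in this commit is the authoritative version):

```js
const bigquery = require('@google-cloud/bigquery')();

const sqlQuery = 'SELECT * FROM publicdata.samples.natality LIMIT 5;';

// "sync": run the query and wait inline for its rows
bigquery.query({ query: sqlQuery })
  .then((results) => console.log('Rows:', results[0]));

// "async": start a query job, then ask the job for its results
bigquery.startQuery({ query: sqlQuery })
  .then((results) => results[0].getQueryResults())
  .then((results) => console.log('Rows:', results[0]));
```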
@@ -94,39 +93,41 @@ __Usage:__ `node tables --help`

 ```
 Commands:
-  create <datasetId> <tableId>                                Create a new table with the specified ID in the
-                                                              specified dataset.
-  list <datasetId>                                            List tables in the specified dataset.
-  delete <datasetId> <tableId>                                Delete the specified table from the specified dataset.
-  copy <srcDatasetId> <srcTableId> <destDatasetId>            Make a copy of an existing table.
-  <destTableId>
-  browse <datasetId> <tableId>                                List the rows from the specified table.
-  import <datasetId> <tableId> <fileName>                     Import data from a local file or a Google Cloud Storage
-                                                              file into the specified table.
+  create <datasetId> <tableId> <schema> [projectId]          Creates a new table.
+  list <datasetId> [projectId]                               Lists all tables in a dataset.
+  delete <datasetId> <tableId> [projectId]                   Deletes a table.
+  copy <srcDatasetId> <srcTableId> <destDatasetId>           Makes a copy of a table.
+  <destTableId> [projectId]
+  browse <datasetId> <tableId> [projectId]                   Lists rows in a table.
+  import <datasetId> <tableId> <fileName> [projectId]        Imports data from a local file into a table.
+  import-gcs <datasetId> <tableId> <bucketName> <fileName>   Imports data from a Google Cloud Storage file into a
+  [projectId]                                                table.
   export <datasetId> <tableId> <bucketName> <fileName>       Export a table from BigQuery to Google Cloud Storage.
-  insert <datasetId> <tableId> <json_or_file>                Insert a JSON array (as a string or newline-delimited
+  [projectId]
+  insert <datasetId> <tableId> <json_or_file> [projectId]    Insert a JSON array (as a string or newline-delimited
   file) into a BigQuery table.

 Options:
   --help  Show help  [boolean]

 Examples:
-  node tables create my_dataset my_table                      Create table "my_table" in "my_dataset".
-  node tables list my_dataset                                 List tables in "my_dataset".
-  node tables browse my_dataset my_table                      Display rows from "my_table" in "my_dataset".
-  node tables delete my_dataset my_table                      Delete "my_table" from "my_dataset".
-  node tables import my_dataset my_table ./data.csv           Import a local file into a table.
-  node tables import my_dataset my_table data.csv --bucket    Import a GCS file into a table.
-  my-bucket
-  node tables export my_dataset my_table my-bucket my-file    Export my_dataset:my_table to gcs://my-bucket/my-file as
-                                                              raw CSV.
-  node tables export my_dataset my_table my-bucket my-file -f Export my_dataset:my_table to gcs://my-bucket/my-file as
-  JSON --gzip                                                 gzipped JSON.
-  node tables insert my_dataset my_table json_string          Insert the JSON array represented by json_string into
+  node tables create my_dataset my_table "Name:string,        Creates a new table named "my_table" in "my_dataset".
+  Age:integer, Weight:float, IsMagic:boolean"
+  node tables list my_dataset                                 Lists tables in "my_dataset".
+  node tables browse my_dataset my_table                      Displays rows from "my_table" in "my_dataset".
+  node tables delete my_dataset my_table                      Deletes "my_table" from "my_dataset".
+  node tables import my_dataset my_table ./data.csv           Imports a local file into a table.
+  node tables import-gcs my_dataset my_table my-bucket        Imports a GCS file into a table.
+  data.csv
+  node tables export my_dataset my_table my-bucket my-file    Exports my_dataset:my_table to gcs://my-bucket/my-file
+                                                              as raw CSV.
+  node tables export my_dataset my_table my-bucket my-file -f Exports my_dataset:my_table to gcs://my-bucket/my-file
+  JSON --gzip                                                 as gzipped JSON.
+  node tables insert my_dataset my_table json_string          Inserts the JSON array represented by json_string into
   my_dataset:my_table.
-  node tables insert my_dataset my_table json_file            Insert the JSON objects contained in json_file (one per
+  node tables insert my_dataset my_table json_file            Inserts the JSON objects contained in json_file (one per
   line) into my_dataset:my_table.
-  node tables copy src_dataset src_table dest_dataset         Copy src_dataset:src_table to dest_dataset:dest_table.
+  node tables copy src_dataset src_table dest_dataset         Copies src_dataset:src_table to dest_dataset:dest_table.
   dest_table

 For more information, see https://cloud.google.com/bigquery/docs
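For the `insert` command, the shape of `<json_or_file>` matters; a small illustration (assuming, as the help text says, that the argument is either a JSON array string or a path to a newline-delimited JSON file, with field names following the schema example above):

```js
// As a string argument: a JSON array of row objects
const jsonString = JSON.stringify([
  { Name: 'foo', Age: 27, Weight: 80.3, IsMagic: true },
  { Name: 'bar', Age: 13, Weight: 54.6, IsMagic: false }
]);

// As a file: the same rows, stored one JSON object per line
// {"Name":"foo","Age":27,"Weight":80.3,"IsMagic":true}
// {"Name":"bar","Age":13,"Weight":54.6,"IsMagic":false}
```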
215 changes: 95 additions & 120 deletions bigquery/datasets.js
@@ -1,161 +1,136 @@
-// Copyright 2016, Google, Inc.
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-//    http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
+/**
+ * Copyright 2016, Google, Inc.
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */

 'use strict';

-// [START setup]
-// By default, the client will authenticate using the service account file
-// specified by the GOOGLE_APPLICATION_CREDENTIALS environment variable and use
-// the project specified by the GCLOUD_PROJECT environment variable. See
-// https://googlecloudplatform.github.io/google-cloud-node/#/docs/google-cloud/latest/guides/authentication
-var BigQuery = require('@google-cloud/bigquery');
-// [END setup]
-
-function createDataset (datasetId, callback) {
-  var bigquery = BigQuery();
-  var dataset = bigquery.dataset(datasetId);
-
-  // See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery/dataset?method=create
-  dataset.create(function (err, dataset, apiResponse) {
-    if (err) {
-      return callback(err);
-    }
-
-    console.log('Created dataset: %s', datasetId);
-    return callback(null, dataset, apiResponse);
-  });
+const BigQuery = require('@google-cloud/bigquery');
+
+// [START bigquery_create_dataset]
+function createDataset (datasetId) {
+  // Instantiates a client
+  const bigquery = BigQuery();
+
+  // Creates a new dataset, e.g. "my_new_dataset"
+  return bigquery.createDataset(datasetId)
+    .then((results) => {
+      const dataset = results[0];
+      console.log(`Dataset ${dataset.id} created.`);
+      return dataset;
+    });
 }
+// [END bigquery_create_dataset]
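A minimal usage sketch (editorial, not part of the commit): the refactored samples return promises, so a caller handles failures with `.catch()` instead of the `makeHandler()` callback helper the old code used.

```js
createDataset('my_new_dataset')
  .then((dataset) => console.log(`Working with ${dataset.id}`))
  .catch((err) => console.error('ERROR:', err));
```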

-function deleteDataset (datasetId, callback) {
-  var bigquery = BigQuery();
-  var dataset = bigquery.dataset(datasetId);
+// [START bigquery_delete_dataset]
+function deleteDataset (datasetId) {
+  // Instantiates a client
+  const bigquery = BigQuery();

-  // See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery/dataset?method=delete
-  dataset.delete(function (err) {
-    if (err) {
-      return callback(err);
-    }
+  // References an existing dataset, e.g. "my_dataset"
+  const dataset = bigquery.dataset(datasetId);

-    console.log('Deleted dataset: %s', datasetId);
-    return callback(null);
-  });
+  // Deletes the dataset
+  return dataset.delete()
+    .then(() => {
+      console.log(`Dataset ${dataset.id} deleted.`);
+    });
 }
+// [END bigquery_delete_dataset]

-function listDatasets (projectId, callback) {
-  var bigquery = BigQuery({
+// [START bigquery_list_datasets]
+function listDatasets (projectId) {
+  // Instantiates a client
+  const bigquery = BigQuery({
     projectId: projectId
   });

-  // See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery?method=getDatasets
-  bigquery.getDatasets(function (err, datasets) {
-    if (err) {
-      return callback(err);
-    }
-
-    console.log('Found %d dataset(s)!', datasets.length);
-    return callback(null, datasets);
-  });
+  // Lists all datasets in the specified project
+  return bigquery.getDatasets()
+    .then((results) => {
+      const datasets = results[0];
+      console.log('Datasets:');
+      datasets.forEach((dataset) => console.log(dataset.id));
+      return datasets;
+    });
 }
+// [END bigquery_list_datasets]
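The same function works across projects, because only the client is project-scoped; a sketch (assuming application default credentials are configured as described in the authentication comment the old setup block linked to):

```js
listDatasets('bigquery-public-data')
  .then((datasets) => console.log(`Found ${datasets.length} dataset(s).`))
  .catch((err) => console.error('ERROR:', err));
```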

-// [START get_dataset_size]
-// Control-flow helper library
-var async = require('async');
-
-function getDatasetSize (datasetId, projectId, callback) {
-  // Instantiate a bigquery client
-  var bigquery = BigQuery({
+// [START bigquery_get_dataset_size]
+function getDatasetSize (datasetId, projectId) {
+  // Instantiate a client
+  const bigquery = BigQuery({
     projectId: projectId
   });
-  var dataset = bigquery.dataset(datasetId);

-  // See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery/dataset?method=getTables
-  dataset.getTables(function (err, tables) {
-    if (err) {
-      return callback(err);
-    }
-
-    return async.map(tables, function (table, cb) {
-      // Fetch more detailed info for each table
-      // See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery/table?method=get
-      table.get(function (err, tableInfo) {
-        if (err) {
-          return cb(err);
-        }
-        // Return numBytes converted to Megabytes
-        var numBytes = tableInfo.metadata.numBytes;
-        return cb(null, (parseInt(numBytes, 10) / 1000) / 1000);
-      });
-    }, function (err, sizes) {
-      if (err) {
-        return callback(err);
-      }
-      var sum = sizes.reduce(function (cur, prev) {
-        return cur + prev;
-      }, 0);
-
-      console.log('Size of %s: %d MB', datasetId, sum);
-      return callback(null, sum);
-    });
-  });
+
+  // References an existing dataset, e.g. "my_dataset"
+  const dataset = bigquery.dataset(datasetId);
+
+  // Lists all tables in the dataset
+  return dataset.getTables()
+    .then((results) => results[0])
+    // Retrieve the metadata for each table
+    .then((tables) => Promise.all(tables.map((table) => table.get())))
+    .then((results) => results.map((result) => result[0]))
+    // Select the size of each table
+    .then((tables) => tables.map((table) => (parseInt(table.metadata.numBytes, 10) / 1000) / 1000))
+    // Sum up the sizes
+    .then((sizes) => sizes.reduce((cur, prev) => cur + prev, 0))
+    // Print and return the size
+    .then((sum) => {
+      console.log(`Size of ${dataset.id}: ${sum} MB`);
+      return sum;
+    });
 }
-// [END get_dataset_size]
+// [END bigquery_get_dataset_size]
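The chain above maps tables to sizes in several small steps; a condensed, functionally equivalent sketch (a hypothetical refactor, not part of the commit) makes the arithmetic explicit. Note that the sample's "MB" are decimal megabytes: bytes / 1000 / 1000.

```js
function getDatasetSizeInMB (datasetId, projectId) {
  // Instantiates a client for the given project
  const bigquery = BigQuery({ projectId: projectId });

  return bigquery.dataset(datasetId).getTables()
    // getTables() resolves to [tables]; table.get() resolves to [table, apiResponse]
    .then((results) => Promise.all(results[0].map((table) => table.get())))
    .then((results) => results.reduce(
      // numBytes arrives as a string, so parse it before dividing (1 MB here = 10^6 bytes)
      (sum, result) => sum + parseInt(result[0].metadata.numBytes, 10) / 1e6,
      0
    ))
    .then((sum) => {
      console.log(`Size of ${datasetId}: ${sum} MB`);
      return sum;
    });
}
```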

 // The command-line program
-var cli = require('yargs');
-var makeHandler = require('../utils').makeHandler;
+const cli = require(`yargs`);

-var program = module.exports = {
+const program = module.exports = {
   createDataset: createDataset,
   deleteDataset: deleteDataset,
   listDatasets: listDatasets,
   getDatasetSize: getDatasetSize,
-  main: function (args) {
+  main: (args) => {
     // Run the command-line program
     cli.help().strict().parse(args).argv;
   }
 };

 cli
   .demand(1)
-  .command('create <datasetId>', 'Create a new dataset with the specified ID.', {}, function (options) {
-    program.createDataset(options.datasetId, makeHandler());
-  })
-  .command('delete <datasetId>', 'Delete the dataset with the specified ID.', {}, function (options) {
-    program.deleteDataset(options.datasetId, makeHandler());
+  .command(`create <datasetId>`, `Creates a new dataset.`, {}, (opts) => {
+    program.createDataset(opts.datasetId);
   })
-  .command('list', 'List datasets in the specified project.', {}, function (options) {
-    program.listDatasets(options.projectId, makeHandler(true, 'id'));
+  .command(`delete <datasetId>`, `Deletes a dataset.`, {}, (opts) => {
+    program.deleteDataset(opts.datasetId);
   })
-  .command('size <datasetId>', 'Calculate the size of the specified dataset.', {}, function (options) {
-    program.getDatasetSize(options.datasetId, options.projectId, makeHandler());
+  .command(`list [projectId]`, `Lists all datasets in the specified project or the current project.`, {}, (opts) => {
+    program.listDatasets(opts.projectId || process.env.GCLOUD_PROJECT);
   })
-  .option('projectId', {
-    alias: 'p',
-    requiresArg: true,
-    type: 'string',
-    default: process.env.GCLOUD_PROJECT,
-    description: 'Optionally specify the project ID to use.',
-    global: true
+  .command(`size <datasetId> [projectId]`, `Calculates the size of a dataset.`, {}, (opts) => {
+    program.getDatasetSize(opts.datasetId, opts.projectId || process.env.GCLOUD_PROJECT);
   })
-  .example('node $0 create my_dataset', 'Create a new dataset with the ID "my_dataset".')
-  .example('node $0 delete my_dataset', 'Delete a dataset identified as "my_dataset".')
-  .example('node $0 list', 'List datasets.')
-  .example('node $0 list -p bigquery-public-data', 'List datasets in the "bigquery-public-data" project.')
-  .example('node $0 size my_dataset', 'Calculate the size of "my_dataset".')
-  .example('node $0 size hacker_news -p bigquery-public-data', 'Calculate the size of "bigquery-public-data:hacker_news".')
+  .example(`node $0 create my_dataset`, `Creates a new dataset named "my_dataset".`)
+  .example(`node $0 delete my_dataset`, `Deletes a dataset named "my_dataset".`)
+  .example(`node $0 list`, `Lists all datasets in the current project.`)
+  .example(`node $0 list bigquery-public-data`, `Lists all datasets in the "bigquery-public-data" project.`)
+  .example(`node $0 size my_dataset`, `Calculates the size of "my_dataset" in the current project.`)
+  .example(`node $0 size hacker_news bigquery-public-data`, `Calculates the size of "bigquery-public-data:hacker_news".`)
   .wrap(120)
   .recommendCommands()
-  .epilogue('For more information, see https://cloud.google.com/bigquery/docs');
+  .epilogue(`For more information, see https://cloud.google.com/bigquery/docs`);

 if (module === require.main) {
   program.main(process.argv.slice(2));
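One design note on the pattern above: guarding the CLI entry point with `module === require.main` means each sample is both a runnable script and an importable module. A hypothetical programmatic use (editorial sketch, not in the commit):

```js
const program = require('./datasets');

program.listDatasets(process.env.GCLOUD_PROJECT)
  .then((datasets) => console.log(`Found ${datasets.length} dataset(s).`));
```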
