I haven’t been able to find great documentation on creating a BigQuery TableSchema using the Java Client Library. This blog post hopes to rectify that :).
You can use the BigQuery sample
code for an idea
of how to create a client connection to BigQuery. Assuming you have the
connection set up you can start by creating a new TableSchema
. The
TableSchema
provides a method for setting the list of fields that make up the
columns of your BigQuery Table. Those columns are defined as an Array of
TableFieldSchema
objects.
ArrayList<TableFieldSchema> fieldSchema = new ArrayList<TableFieldSchema>();
For simple types you can populate your columns with the correct type and mode according to the BigQuery API documentation. For example, to create a STRING field that is NULLABLE you can use the following.
fieldSchema.add(new TableFieldSchema().setName("username").setType("STRING").setMode("NULLABLE"));
And for repeated fields you can use the REPEATED mode.
fieldSchema.add(new TableFieldSchema().setName("email").setType("STRING").setMode("REPEATED"));
To create nested records you specify the parent as a RECORD mode and then call
setFields
for each column of nested data you want to insert. The columns of a
nested type are the same format as for the parent – a list of TableFieldSchema
objects.
fieldSchema.add(
new TableFieldSchema().setName("location").setType("RECORD").setFields(
new ArrayList<TableFieldSchema>() {
{
add(new TableFieldSchema().setName("city").setType("STRING"));
add(new TableFieldSchema().setName("address").setType("STRING"));
add(new TableFieldSchema().setName("zipcode").setType("STRING"));
}
}
)
);
The last step is to set the entire schema as the fields of our table schema.
TableSchema schema = new TableSchema();
schema.setFields(fieldSchema);
Then we set a TableReference
that holds the current project id, dataset id and
table id. We use this TableReference
to create our Table
using the TableSchema
.
TableReference ref = new TableReference();
ref.setProjectId(PROJECT_ID);
ref.setDatasetId("pubsub");
ref.setTableId("review_test");
Table content = new Table();
content.setTableReference(ref);
content.setSchema(schema);
client.tables().insert(ref.getProjectId(), ref.getDatasetId(), content).execute();
Putting this all together gives you a working sample of creating a BigQuery Table using the Java Client Library.
public static void main(String[] args) throws IOException, InterruptedException {
Bigquery client = createAuthorizedClient(); // As per the BQ sample code
ArrayList<TableFieldSchema> fieldSchema = new ArrayList<TableFieldSchema>();
fieldSchema.add(new TableFieldSchema().setName("username").setType("STRING").setMode("NULLABLE"));
fieldSchema.add(new TableFieldSchema().setName("email").setType("STRING").setMode("REPEATED"));
fieldSchema.add(
new TableFieldSchema().setName("location").setType("RECORD").setFields(
new ArrayList<TableFieldSchema>() {
{
add(new TableFieldSchema().setName("city").setType("STRING"));
add(new TableFieldSchema().setName("address").setType("STRING"));
add(new TableFieldSchema().setName("zipcode").setType("STRING"));
}
}));
TableSchema schema = new TableSchema();
schema.setFields(fieldSchema);
TableReference ref = new TableReference();
ref.setProjectId("<YOUR_PROJECT_ID>");
ref.setDatasetId("<YOUR_DATASET_ID>");
ref.setTableId("<YOUR_TABLE_ID>");
Table content = new Table();
content.setTableReference(ref);
content.setSchema(schema);
client.tables().insert(ref.getProjectId(), ref.getDatasetId(), content).execute();
}