I haven’t been able to find great documentation on creating a BigQuery
TableSchema using the Java Client Library. This blog post hopes to rectify that
:).
You can use the BigQuery sample
code for an idea
of how to create a client connection to BigQuery. Assuming you have the
connection set up you can start by creating a new TableSchema
. The
TableSchema
provides a method for setting the list of fields that make up the
columns of your BigQuery Table. Those columns are defined as an Array of
TableFieldSchema
objects.
1
|
ArrayList<TableFieldSchema> fieldSchema = new ArrayList<TableFieldSchema>();
|
For simple types you can populate your columns with the correct type and mode
according to the BigQuery API
documentation.
For example, to create a STRING field that is NULLABLE you can use the
following.
1
|
fieldSchema.add(new TableFieldSchema().setName("username").setType("STRING").setMode("NULLABLE"));
|
And for repeated fields you can use the REPEATED mode.
1
|
fieldSchema.add(new TableFieldSchema().setName("email").setType("STRING").setMode("REPEATED"));
|
To create nested records you specify the parent as a RECORD mode and then call
setFields
for each column of nested data you want to insert. The columns of a
nested type are the same format as for the parent – a list of TableFieldSchema
objects.
1
2
3
4
5
6
7
8
9
10
11
|
fieldSchema.add(
new TableFieldSchema().setName("location").setType("RECORD").setFields(
new ArrayList<TableFieldSchema>() {
{
add(new TableFieldSchema().setName("city").setType("STRING"));
add(new TableFieldSchema().setName("address").setType("STRING"));
add(new TableFieldSchema().setName("zipcode").setType("STRING"));
}
}
)
);
|
The last step is to set the entire schema as the fields of our table schema.
1
2
|
TableSchema schema = new TableSchema();
schema.setFields(fieldSchema);
|
Then we set a TableReference
that holds the current project id, dataset id and
table id. We use this TableReference
to create our Table
using the TableSchema
.
1
2
3
4
5
6
7
8
9
10
|
TableReference ref = new TableReference();
ref.setProjectId(PROJECT_ID);
ref.setDatasetId("pubsub");
ref.setTableId("review_test");
Table content = new Table();
content.setTableReference(ref);
content.setSchema(schema);
client.tables().insert(ref.getProjectId(), ref.getDatasetId(), content).execute();
|
Putting this all together gives you a working sample of creating a BigQuery Table using the Java Client Library.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
|
public static void main(String[] args) throws IOException, InterruptedException {
Bigquery client = createAuthorizedClient(); // As per the BQ sample code
ArrayList<TableFieldSchema> fieldSchema = new ArrayList<TableFieldSchema>();
fieldSchema.add(new TableFieldSchema().setName("username").setType("STRING").setMode("NULLABLE"));
fieldSchema.add(new TableFieldSchema().setName("email").setType("STRING").setMode("REPEATED"));
fieldSchema.add(
new TableFieldSchema().setName("location").setType("RECORD").setFields(
new ArrayList<TableFieldSchema>() {
{
add(new TableFieldSchema().setName("city").setType("STRING"));
add(new TableFieldSchema().setName("address").setType("STRING"));
add(new TableFieldSchema().setName("zipcode").setType("STRING"));
}
}));
TableSchema schema = new TableSchema();
schema.setFields(fieldSchema);
TableReference ref = new TableReference();
ref.setProjectId("<YOUR_PROJECT_ID>");
ref.setDatasetId("<YOUR_DATASET_ID>");
ref.setTableId("<YOUR_TABLE_ID>");
Table content = new Table();
content.setTableReference(ref);
content.setSchema(schema);
client.tables().insert(ref.getProjectId(), ref.getDatasetId(), content).execute();
}
|
See also