Cloud Computing - First Look 5: Google Storage and Data
- Overview
- Exploring Cloud Storage
- Exploring Cloud SQL
- Exploring Cloud DataStore (structure is similar to Elasticsearch)
- Exploring BigQuery
- Manage storage and data with Python (2.7)
Overview
GCP components:
- App Engine (PaaS)
- Compute Engine (IaaS)
- Cloud Storage (*)
- Cloud Datastore (*)
- Cloud SQL (*)
- BigQuery (*)
Cloud Storage
- Global and regional hosting of assets and data
- Edge caching = reduced latency = faster access
- Guaranteed uptime of 99.95%
- Backup and restore options
- No cap on storage limits
- Enhanced security
- OAuth 2.0 authentication
- Group-based access control, ACLs
- Priced according to amount used
- Bucket and object-oriented
- Access
- API via XML
- API via JSON
- Online Google Cloud Console
- Command line: gsutil
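To make the "API via JSON" access path a little more concrete, here is a minimal sketch that only builds request URLs for the Cloud Storage JSON API (v1); it sends nothing over the network. The helper names `list_objects_url` and `object_metadata_url` and the bucket `my-example-bucket` are hypothetical, chosen for illustration. A real request would also need an OAuth 2.0 token unless the bucket is public.

```python
try:
    from urllib.parse import quote, urlencode  # Python 3
except ImportError:
    from urllib import quote, urlencode  # Python 2.7

# Base endpoint of the Cloud Storage JSON API (v1).
API_ROOT = "https://storage.googleapis.com/storage/v1"


def list_objects_url(bucket, prefix=None):
    """Return the URL that lists the objects in `bucket` (hypothetical helper)."""
    url = "{0}/b/{1}/o".format(API_ROOT, quote(bucket, safe=""))
    if prefix:
        # Optional filter: only objects whose names start with `prefix`.
        url += "?" + urlencode({"prefix": prefix})
    return url


def object_metadata_url(bucket, object_name):
    """Return the URL for one object's metadata (hypothetical helper).

    Object names may contain '/', which has to be percent-encoded
    when the name is used as a single path segment.
    """
    return "{0}/b/{1}/o/{2}".format(
        API_ROOT, quote(bucket, safe=""), quote(object_name, safe=""))


# Bucket and object names below are made up for illustration.
print(list_objects_url("my-example-bucket", prefix="logs/"))
print(object_metadata_url("my-example-bucket", "logs/2014/app.log"))
```

The resulting URLs can be passed to any HTTP client; the same bucket/object layout is what gsutil and the console expose.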
Cloud SQL
- Relational DB
- MySQL-based
- Up to 100GB storage
- Up to 16GB RAM
- Automatic db replication to multiple locations
- Point-in-time backup and recovery
- Common tool support
- mysqldump
- MySQL Wire Protocol
- JDBC
- As-needed, instance based
- Multiple access points:
- Google Cloud Console interface
- Standard MySQL connections
- MySQL Client
- JSON API
Cloud DataStore
- Non-relational DB
- NoSQL
- Automatic replication
- Supports ACID transactions for reliable processing
- Access
- Google Cloud Console interface
- Command line: gcd tool
- JSON API
BigQuery
- Analyze massive amounts of data extremely quickly
- Access via simple UI or REST interface
- Data storage scales to hundreds of TB
- Client APIs
- SQL dialect
Other evolving services:
- Container Engine
- Cloud DNS
- Cloud Pub/Sub
What do Google storage and data services rely on?
- App Engine (PaaS)
- Compute Engine (IaaS)
Exploring Cloud Storage
- supports ACLs, so you can be sure your data is shared only with the people you choose
- once a massive amount of data is stored, it can be analyzed by BigQuery
- supports the HTTP protocol
- buckets contain objects; an object = the object data itself + its metadata (a series of name/value pairs)
- handle storage from the command line
- Install Google Cloud SDK first
- gcloud auth login
- gcloud config set project {projectID}
- gsutil
- gsutil mb {bucket.name}
- gsutil ls
- gsutil ls -L
- gsutil ls -l
- gsutil ls {bucket.name}
- gsutil cp {object.path} {bucket.name}
- gsutil mv {bucket.name.object} {bucket.name1}
- gsutil -m acl set -r public-read {bucket.name}
- gsutil web set -m index.html -e 404.shtml {bucket.name}
- … please refer to the official docs
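The gsutil commands above can also be driven from a script. The following is a rough sketch, assuming Python 3 (for `shutil.which`) and assuming gsutil is installed and authenticated before anything is actually run. The helpers `bucket_url`, `gsutil_args` and `run_gsutil`, as well as the bucket and file names, are hypothetical. gsutil addresses buckets and objects with `gs://` URLs, which is what `bucket_url` builds.

```python
import shutil
import subprocess


def bucket_url(bucket, object_name=None):
    """Build a gs:// URL for a bucket, or for an object inside it."""
    url = "gs://" + bucket
    if object_name:
        url += "/" + object_name.lstrip("/")
    return url


def gsutil_args(*args):
    """Return the argument list for a gsutil invocation (nothing is run)."""
    return ["gsutil"] + list(args)


def run_gsutil(*args):
    """Run gsutil if it is on PATH; return its exit code, or None if absent."""
    if shutil.which("gsutil") is None:
        return None
    return subprocess.call(gsutil_args(*args))


# A few of the commands from the list, expressed as argument lists.
# "my-example-bucket" and "report.csv" are made-up names.
make_bucket = gsutil_args("mb", bucket_url("my-example-bucket"))
upload = gsutil_args("cp", "report.csv", bucket_url("my-example-bucket"))
listing = gsutil_args("ls", "-l", bucket_url("my-example-bucket"))

for command in (make_bucket, upload, listing):
    print(" ".join(command))
```

Building the argument list separately from running it keeps the commands easy to inspect before they touch a real bucket; `run_gsutil("ls")` would execute one only when gsutil is available.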
Exploring Cloud SQL
- Restrictions:
- supports MySQL 5.5 or higher
- instance size limited to 500GB
- …
- Import, Export ->.sql
- SQL Prompt
- Google Cloud SQL API
- you can enter SQL directly in the browser to operate on the Cloud DB
- Backup & Restore
- Steps on UI
- create a db with Cloud SQL
- upload a .sql file to a bucket
- import the .sql file (by bucket path)
- go to SQL API -> SQL Prompt -> enter a SQL statement
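The import-then-query flow in the steps above can be rehearsed locally. This sketch does not talk to Cloud SQL or MySQL; it swaps in Python's built-in `sqlite3` with an in-memory database as a stand-in, purely to show what "import a .sql file, then run a statement" amounts to. The `guestbook` table, its rows, and the helper names are made up for illustration.

```python
import sqlite3

# Hypothetical contents of a small .sql dump; a real one would
# typically be produced by mysqldump.
DUMP_SQL = """
CREATE TABLE guestbook (id INTEGER PRIMARY KEY, author TEXT, message TEXT);
INSERT INTO guestbook (author, message) VALUES ('alice', 'first post');
INSERT INTO guestbook (author, message) VALUES ('bob', 'hello');
"""


def import_dump(conn, sql_text):
    """Execute every statement in a dump, similar to the import step."""
    conn.executescript(sql_text)
    conn.commit()


def run_statement(conn, statement, params=()):
    """Run one SQL statement and return all rows, similar to the SQL Prompt."""
    return conn.execute(statement, params).fetchall()


# sqlite3 in memory is only a local stand-in for the Cloud SQL instance.
conn = sqlite3.connect(":memory:")
import_dump(conn, DUMP_SQL)
rows = run_statement(conn, "SELECT author, message FROM guestbook ORDER BY id")
print(rows)
```

Against a real Cloud SQL instance the same two steps would go through a standard MySQL connection instead; SQL dialect differences between SQLite and MySQL mean a real dump may not load unchanged here.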
Exploring Cloud DataStore (structure is similar to Elasticsearch)
- Basic concepts you need to know
- Entities:
- Primary DataStore object (vs. db)
- Has one or more properties (vs. db column)
- Each property can have one or more values (vs. db row)
- Properties can have different data types
- Can be hierarchical
- Steps:
- Create an entity
- add entity name
- add kind name (table)
- add property
- GCD
- a tool that lets you create a local dev-server DataStore for your app
- and run an administrative interface in the browser
- ./gcd.sh help
- ./gcd.sh create {projectId}
- ./gcd.sh start {projectId}
- put data files, e.g. an XML file, in the project/WEB-INF folder
- ./gcd.sh updateindexes {projectId}
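The entity concepts listed above (kind, name, multi-valued properties of mixed types, hierarchy) can be pictured with a toy in-memory model. This is only an illustration of the data shape, not the DataStore client library; the `Entity` class, its methods, and the sample kinds and values are all hypothetical.

```python
class Entity(object):
    """A toy in-memory model of a DataStore entity (not the real client)."""

    def __init__(self, kind, name, parent=None):
        self.kind = kind          # plays the role of a table name
        self.name = name          # identifies the entity within its kind
        self.parent = parent      # optional ancestor, giving a hierarchy
        self.properties = {}      # property name -> list of values

    def add_value(self, prop, value):
        """Append a value; a property may hold several, of mixed types."""
        self.properties.setdefault(prop, []).append(value)

    def key_path(self):
        """Return the (kind, name) pairs from the root ancestor to self."""
        path = self.parent.key_path() if self.parent else []
        return path + [(self.kind, self.name)]


# Made-up sample data: a Book entity whose ancestor is an Author entity.
author = Entity("Author", "alice")
book = Entity("Book", "cloud-notes", parent=author)
book.add_value("tag", "gcp")
book.add_value("tag", "storage")   # second value for the same property
book.add_value("pages", 120)       # a property with a different data type

print(book.key_path())
print(book.properties)
```

The key path is what makes entities hierarchical: a child is addressed through its ancestors rather than through a foreign-key column.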
Exploring BigQuery
- What’s BigQuery
- data analyzer
- based on OLAP principles (online analytical processing)
- CSV data based (comma-separated values)
- relies on a small number of unrelated tables
- it's NOT a db: no inserts, updates or deletions
- supports joins when one side is much smaller than the other
- BigQuery API
- Loading data
- UI->BigData->BigQuery
- create a new dataset
- import data to the dataset
- schema: e.g., name:string,gender:string,frequency:integer,year:integer
- Analyze data
- select the dataset -> click QueryTable
- input sql statement, eg:
- select name, gender, year from xxx where gender = 'F' and ((year=2003) or (year=2013)) order by frequency DESC limit 10
- select name, SUM(frequency) as count from xxx group by (name) order by count DESC limit 10
- Exporting data
- download as CSV
- save as table
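To see what the schema string and the second query above mean, here is a small local sketch assuming Python 3. It parses a `name:type` schema string into field dicts (resembling the name/type pairs of BigQuery's JSON schema) and emulates `SUM(frequency) ... GROUP BY name ORDER BY count DESC LIMIT n` over a few CSV rows. The helper names and the sample rows are made up for illustration; nothing here calls BigQuery.

```python
import csv
import io
from collections import Counter


def parse_schema(text):
    """Turn 'name:string,year:integer' into a list of field dicts."""
    fields = []
    for part in text.split(","):
        name, _, ftype = part.strip().partition(":")
        fields.append({"name": name, "type": (ftype or "string").upper()})
    return fields


def top_names(csv_text, schema, limit=10):
    """Emulate: SELECT name, SUM(frequency) AS count FROM ...
    GROUP BY name ORDER BY count DESC LIMIT <limit>."""
    columns = [field["name"] for field in schema]
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text), fieldnames=columns):
        totals[row["name"]] += int(row["frequency"])
    return totals.most_common(limit)


SCHEMA = parse_schema("name:string,gender:string,frequency:integer,year:integer")

# Made-up rows, for illustration only (name, gender, frequency, year).
SAMPLE_CSV = "Emma,F,5,2003\nEmma,F,7,2013\nNoah,M,4,2013\n"

print(SCHEMA)
print(top_names(SAMPLE_CSV, SCHEMA))
```

BigQuery runs the same kind of aggregation over far larger tables; the point of the sketch is only the shape of the input (CSV plus schema) and of the result (name, summed count).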
Manage storage and data with Python (2.7)
- setup dev env:
- pip install --upgrade google-api-python-client
- pip install --upgrade httplib2
- pip install --upgrade argparse
- Aptana Studio
- auth to google storage and store
- APIs & auth -> Consent screen (policy)
- Credentials
- (For dev) OAuth -> installed application -> generate client ID -> download JSON
- public api access
- run your Python scripts
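As a rough idea of what a script does with the downloaded client JSON, the sketch below parses an installed-application client file and builds the OAuth 2.0 authorization URL a user would open to grant access. It uses only the standard library and makes no network calls. The helper names, the redirect URI, and every value in `EXAMPLE_JSON` are placeholders; in practice google-api-python-client and its OAuth helpers handle this flow, including the token exchange that is not shown here.

```python
import json

try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2.7

# Read-only scope for Cloud Storage.
STORAGE_READ_SCOPE = "https://www.googleapis.com/auth/devstorage.read_only"


def load_installed_client(json_text):
    """Parse a downloaded client JSON and return its 'installed' section."""
    config = json.loads(json_text)
    if "installed" not in config:
        raise ValueError("expected an installed-application client file")
    client = config["installed"]
    for key in ("client_id", "client_secret", "auth_uri", "token_uri"):
        if key not in client:
            raise ValueError("client file is missing " + key)
    return client


def authorization_url(client, redirect_uri, scope=STORAGE_READ_SCOPE):
    """Build the URL the user visits to grant access (authorization-code flow)."""
    params = [
        ("client_id", client["client_id"]),
        ("redirect_uri", redirect_uri),
        ("response_type", "code"),
        ("scope", scope),
    ]
    return client["auth_uri"] + "?" + urlencode(params)


# Placeholder values only; a real file comes from the console download.
EXAMPLE_JSON = json.dumps({"installed": {
    "client_id": "example-id.apps.googleusercontent.com",
    "client_secret": "not-a-real-secret",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://accounts.google.com/o/oauth2/token",
}})

client = load_installed_client(EXAMPLE_JSON)
print(authorization_url(client, "http://localhost:8080/"))
```

Keep the real client JSON out of version control, since it contains the client secret.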