Overview

  1. GCP components:

    • App Engine (PaaS)
    • Compute Engine (IaaS)
    • Cloud Storage (*)
    • Cloud Datastore (*)
    • Cloud SQL (*)
    • BigQuery (*)
  2. Cloud Storage

    • Global regional hosting of assets and data
    • Edge caching = reduced latency = faster access
    • Guaranteed uptime of 99.95%
    • Backup and restore options
    • No cap on storage limits
    • Enhanced security
    • OAuth 2.0 authentication
    • Group-based access control, ACLs
    • Priced according to amount used
    • Bucket and object-oriented
    • Access
      • API via XML
      • API via JSON
      • Online Google Cloud Console
      • Command line: gsutil
  3. Cloud SQL

    • Relational DB
    • MySQL-based
    • Up to 100GB storage
    • Up to 16GB RAM
    • Automatic db replication to multiple locations
    • Point-in-time backup and recovery
    • Common tool support
      • mysqldump
      • MySQL Wire Protocol
      • JDBC
    • As-needed, instance based
    • Multiple access points:
      • Google Cloud Console interface
      • Standard MySQL connections
      • MySQL Client
      • JSON API
  4. Cloud DataStore

    • Non-relational DB
    • NoSQL
    • Automatic replication
    • Supports ACID transactions for reliable processing
    • Access
      • Google Cloud Console interface
      • Command line: gcd tool
      • JSON API
  5. BigQuery

    • Analyze massive amounts of data extremely quickly
    • Access via simple UI or REST interface
    • Data storage scales to hundreds of TB
    • Client APIs
    • SQL dialect
  6. Other evolving services:

    • Container Engine
    • Cloud DNS
    • Cloud Pub/Sub
  7. Google storage and data services rely on:

    • App Engine (PaaS)
    • Compute Engine (IaaS)

Exploring Cloud Storage

  1. Supports ACLs, so you can be sure your data is shared only with whom you intend
  2. Once a massive amount of data is stored, it can be analyzed with BigQuery
  3. Supports the HTTP protocol
  4. Buckets contain objects; an object = the object itself + the object's metadata (a series of name/value pairs)
  5. Handle storage from the command line (a JSON API sketch in Python follows this list)
    • Install the Google Cloud SDK first
    • gcloud auth login
    • gcloud config set project {projectID}
    • gsutil
      • gsutil mb {bucket.name}
      • gsutil ls
      • gsutil ls -L
      • gsutil ls -l
      • gsutil ls {bucket.name}
      • gsutil cp {object.path} {bucket.name}
      • gsutil mv {bucket.name.object} {bucket.name1}
      • gsutil -m acl set -r public-read {bucket.name}
      • gsutil web set -m index.html -e 404.shtml {bucket.name}
      • … refer to the official docs for more
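
  A minimal Python sketch of the "API via JSON" access route listed in the overview, using the google-api-python-client library installed in the final section of these notes; the bucket name and the pre-authorized http object are assumptions for illustration:

    # Sketch: list a bucket's objects through the Cloud Storage JSON API.
    # Assumes http is an httplib2.Http authorized with OAuth 2.0 credentials
    # (see the auth sketch at the end of these notes).
    from apiclient.discovery import build  # from google-api-python-client

    service = build('storage', 'v1', http=http)
    # Each item is an object resource: the object name plus its metadata name/value pairs.
    resp = service.objects().list(bucket='my-example-bucket').execute()
    for obj in resp.get('items', []):
        print obj['name'], obj.get('size')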

Exploring Cloud SQL

  1. Restrictions:
    • supports MySQL 5.5 or higher
    • instance size limited to 500GB
  2. Import, Export -> .sql
  3. SQL Prompt
    • Google Cloud SQL API
    • SQL statements can be entered directly in the browser to operate on the Cloud SQL database
  4. Backup & Restore
  5. Steps on UI (a Python connection sketch follows this list)
    • create a database with Cloud SQL
    • upload a .sql file to a bucket
    • import the .sql file (by bucket path)
    • go to SQL API -> SQL Prompt -> enter SQL statements
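
  Besides the SQL Prompt, the "Standard MySQL connections" access point from the overview works from Python. A minimal sketch assuming the MySQLdb (MySQL-python) driver and placeholder instance IP, user, password, database and table names:

    # Sketch: query a Cloud SQL instance over a standard MySQL connection.
    # Host IP, user, password, database and table names are placeholders.
    import MySQLdb

    conn = MySQLdb.connect(host='173.194.0.1',        # the instance's assigned IP
                           user='root',
                           passwd='your-password',
                           db='your_database')
    cur = conn.cursor()
    cur.execute('SELECT COUNT(*) FROM your_table')
    print cur.fetchone()[0]
    conn.close()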

Exploring Cloud DataStore (its structure is similar to Elasticsearch)

  1. Basic concepts to know (see the entity sketch after this list)
    • Entities:
      • Primary Datastore object (vs. a database)
      • Has one or more properties (vs. database columns)
      • Each property can have one or more values (vs. database rows)
      • Properties can have different data types
      • Can be hierarchical
  2. Steps:
    • Create an entity
      • add entity name
      • add kind name (table)
      • add property
  3. GCD
    • a tool that allows you to create a local dev server Datastore for your app
    • and run an administrative interface in the browser
    • ./gcd.sh help
    • ./gcd.sh create {projectId}
    • ./gcd.sh start {projectId}
    • put the data file, e.g. an XML file, in the project's WEB-INF folder
    • ./gcd.sh updateindexes {projectId}
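
  To make the entity/property/value vocabulary above concrete, a purely illustrative Python sketch of what a single Datastore entity conceptually holds (the kind, key path and property names are made up):

    # Illustration only: the conceptual shape of one Datastore entity.
    # Kind ~ "table" name, properties are name/value pairs, a property may
    # hold multiple values, and the key path can be hierarchical.
    entity = {
        'kind': 'Person',                              # grouping, like a table name
        'key': [('Company', 'acme'), ('Person', 42)],  # hierarchical ancestor path
        'properties': {
            'name': 'Ada',                             # single value
            'emails': ['ada@example.com',              # one property, multiple values
                       'ada@acme.example'],
            'age': 36,                                 # properties can differ in type
        },
    }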

Exploring BigQuery

  1. What’s BigQuery
    • data analyzer
    • based on OLAP principles (online analytical processing)
    • CSV data based (comma-separated values)
    • reliant on a small number of non-related tables
    • it's NOT a DB: no inserts, updates or deletions
    • supports joins when one side is much smaller than the other
  2. BigQuery API
  3. Loading data
    • UI->BigData->BigQuery
    • create a new dataset
    • import data to the dataset
      • schema, e.g.: name:string,gender:string,frequency:integer,year:integer
  4. Analyze data (a Python API sketch follows this list)
    • select the dataset -> click Query Table
    • input a SQL statement, e.g.:
      • select name, gender, year from xxx where gender = 'F' and ((year=2003) or (year=2013)) order by frequency DESC limit 10
      • select name, SUM(frequency) as count from xxx group by (name) order by count DESC limit 10
  5. Exporting data
    • download as CSV
    • save as table
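
  A minimal Python sketch of running the second query above through the BigQuery REST API (assumes an authorized http object as in the final section; the project, dataset and table names are placeholders):

    # Sketch: run a synchronous query job through the BigQuery v2 JSON API.
    # Assumes http is an httplib2.Http authorized with OAuth 2.0 credentials.
    from apiclient.discovery import build

    bigquery = build('bigquery', 'v2', http=http)
    query = ('select name, SUM(frequency) as count from mydataset.names '
             'group by (name) order by count DESC limit 10')
    resp = bigquery.jobs().query(projectId='my-project-id',
                                 body={'query': query}).execute()
    for row in resp.get('rows', []):
        # each row is {'f': [{'v': value}, ...]} in the field order of the SELECT
        print row['f'][0]['v'], row['f'][1]['v']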

Manage storage and data with Python (2.7)

  1. set up the dev env:
    • pip install --upgrade google-api-python-client
    • pip install --upgrade httplib2
    • pip install --upgrade argparse
    • Aptana Studio
  2. auth to Google storage and data services
    • APIs & auth -> Consent screen (policy)
    • Credentials (for dev): OAuth -> installed application -> generate client ID -> download JSON
      • public API access
  3. run your Python scripts (a minimal auth sketch follows)
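
  A minimal sketch of the auth flow above, using the libraries installed in step 1; the file names client_secrets.json and storage.dat, the Cloud Storage scope, and the project ID are assumptions for illustration:

    # Sketch: OAuth 2.0 "installed application" flow using the downloaded
    # client secrets JSON, then build a Cloud Storage service for your scripts.
    import argparse
    import httplib2
    from apiclient.discovery import build
    from oauth2client import tools
    from oauth2client.client import flow_from_clientsecrets
    from oauth2client.file import Storage

    SCOPE = 'https://www.googleapis.com/auth/devstorage.read_write'

    flow = flow_from_clientsecrets('client_secrets.json', scope=SCOPE)
    store = Storage('storage.dat')            # caches the obtained credentials
    credentials = store.get()
    if credentials is None or credentials.invalid:
        flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args([])
        credentials = tools.run_flow(flow, store, flags)   # opens the consent screen

    http = credentials.authorize(httplib2.Http())
    service = build('storage', 'v1', http=http)
    print service.buckets().list(project='my-project-id').execute()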