Big data: sources, definitions, and scales. Paradigms for doing science. Relational database technology: principles of relational databases, operations in relational databases (including an introduction to SQL), database design and discussion of normal forms, writing queries. MapReduce and parallel/distributed algorithms: the MapReduce model and programming, parallel and distributed query processing. NoSQL databases: data organization at large scale, ACID and its costs, the distributed consensus problem, current implementations (Bigtable, LH*, DynamoDB, Pig, Spark, etc.). Analyzing graphs: representation and analysis of networks abstracted as graphs, PageRank.