Friday, 12 December 2014

BI vs. Big Data Analytics

    Semi-structured and unstructured data do not need to be converted to structured data. Each kind has its own storage and analytics methods.
    • Structured (retail, financial, bioinformatics, geodata)
    • Semi-structured (web logs, email, documents): has some structure, but it does not conform to existing data models such as relational (RDB) or object-oriented (OODB) databases.
    • Unstructured (images, video, sensor data, web pages)
     Structured data vs. semi-/unstructured data:
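
To make the distinction in the list above concrete, here is a minimal sketch of analyzing semi-structured records in place, without first forcing them into a fixed table schema. The sample rows, the log lines, and the helper name `field_counts` are all hypothetical and made up for illustration; they do not come from any real system.

```python
import json

# Structured data: every row has the same columns in the same order
# (hypothetical retail rows: date, store, amount).
structured_rows = [
    ("2014-12-12", "store-1", 19.99),
    ("2014-12-12", "store-2", 5.50),
]

# Semi-structured data: each record describes itself, and the set of
# fields can differ from record to record (hypothetical web log lines).
log_lines = [
    '{"ts": "2014-12-12T10:00:00", "path": "/", "status": 200}',
    '{"ts": "2014-12-12T10:00:05", "path": "/login", "user": "alice"}',
    '{"ts": "2014-12-12T10:00:09", "error": "timeout"}',
]


def field_counts(lines):
    """Count how often each field name appears across JSON records."""
    counts = {}
    for line in lines:
        record = json.loads(line)
        for field in record:
            counts[field] = counts.get(field, 0) + 1
    return counts


counts = field_counts(log_lines)
print(counts)
```

The records are analyzed as they are: a field that is missing from one record is simply not counted for it, so no shared schema has to be designed up front.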

     Design pattern for the operationalized data lake
    source: https://www.mongodb.com/hadoop-and-mongodb

    A database shard is a horizontal partition of data in a database or search engine. Each individual partition is referred to as a shard or database shard. Each shard is held on a separate database server instance to spread load.
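
The sharding idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say how keys are assigned to shards, so hash-based routing is assumed here, and plain in-memory dictionaries stand in for the separate database server instances. The names `shard_for` and `ShardedStore` are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (assumed routing rule)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store; each dict stands in for one server instance."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        # Each row lives on exactly one shard (horizontal partitioning).
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        # Reads go to the same shard the key was written to.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for i in range(100):
    store.put(f"user-{i}", {"id": i})

# Show how many rows ended up on each shard.
print([len(shard) for shard in store.shards])
```

A stable hash is used instead of Python's built-in `hash()` because the built-in is randomized per process for strings, which would send the same key to different shards across restarts. Changing the number of shards remaps most keys under this simple modulo scheme, which real systems typically address with other partitioning strategies.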

    Big data warehouse software: Apache Tajo, http://tajo.apache.org