Showing posts with label Hadoop for Microsoft Windows. Show all posts
Showing posts with label Hadoop for Microsoft Windows. Show all posts

Thursday, December 22, 2011

Cloud (Azure) Data and Storage

Great presentation by Dave Campbell on Cloud Data and Storage . I’ve highlighted my learning below but be sure to check out the presentation. Hadoop for Microsoft Azure looks promising.

We have 3 storage options in Azure: Blog, Table and SQL Azure

image

Azure Storage options:

image

SQL Storage options:

image

Everything is available at the portal:

image

Blob:

Is contained within a hierarchical namespace with a RESTful interface. You can GET and PUT Blobs. There are two types of Blobs: Page and Block. The RESTful interace allows you to build client tools/libraries against the Azure Storage Service.

Table Storage:

In Azure Table storage is a service – collection of tables. Table storage are simple, REST accessible, highly scalable and cost effective.

SQL Azure:

SQL Server database technology is delivered as a service on Windows Azure Platform ideal for both simple and complex applications that is enterprise ready and designed to scale out elastically with demand. The advantages are you are no longer to physically manage these servers. SQL Azure offers a pay as go service:

image

Clients:

SQL Azure provides thin client and thick client to manage your SQL Azure database.

Thin client (browser):

image

Thick Client:

You can use SQL Server Administrator to connect to a SQN Azure DB:

image

or connect using Visual Studio 2010:

image

The cool thing is Visual Studio lets you deploy an Azure database easily in a few steps:

Step 1: Create a Project from your local database:

image

image

Step 2: Set the Target Platform

image

Step 3: Publish to SQL Azure

image

SQL Federations allows scaling our your database

  • Integrated database sharding that can scale to hundreds of nodes
  • Multi-tenancy via flexible repartioning
  • Online split operations to minimize downtime
  • Automatic data discovery regardless of changes in how data is partitioned

What is Database Sharding?

Database sharding is a method of horizontal partitioning in a database or search engine. Each individual partition is referred to as a shard or database shard.

You must enable Federation on tables with the below SQL:

image

image

Federated Databases can be managed using the think client:

image

image

Note the results of a stress test conducted on a Federated VS Non-Federated database.

image

Federated database let you distribute IO load and ability to manage space.

Hadoop on Azure:

Microsoft is working with the Apache community to get a Hadoop version for Microsoft Windows. The Developer Preview for the Apache Hadoop- based Services for Windows Azure is available here https://www.hadooponazure.com/

image

Interactive mode: No tools to be installed

image

cat is the command to look at the contents of a file

image

# is the command to look at all the commands available

image

image

The query to get the word could on some of the Gutenberg projects is below:

image

When you run the query, the Hadoop Job Scheduler will automatically send the job (query) out to the cluster with one process for each of the above files.

from(“gutenberg”) – gets the data from your dataset

mapReduce(“WordCount.js”, “word, count:long”) – map reduce and pass your Word Count function

orderBy(“count DESC”) – order results by descending

take(10) – reduction to the top 10 words

Hadoop for Azure lets you plot graphs based on the data you have:

image

image

Hadoop supports multiple languages such as: Python, Ruby and .NET

What is Hive?

image

Hive is structured data that sits over the Hadoop layer (unstructured data). Note the Hive query syntax below:

image

ShareThis

Meebo

Labels

2012 (1) Adobe Reader (1) Analysis Services (4) Analytic Functions (1) APEX Web Service (1) Apple (1) Aptana (1) Architecture (1) ASP.NET MVC3 (2) Azure (3) Azure Data Market (1) Azure Data Marketplace (4) Best Practices Analyzer (1) Best States in America (US) for Business–CNBC–Excel 2013 Web App Interactive (1) BI for .NET (1) Big Data (2) Bluefish (1) Boise (1) Butternut (1) C# (3) C# Introduction (1) Chow (1) Chrome (1) Chrome Remote Desktop (1) Chromium (1) Cliplets (1) Cloud 9 IDE (1) Cloudon MS Office iPad app (1) Code (1) CodeAcademy (1) CoffeeCup (1) Command (1) communicator (1) Community Edition (1) compiz (1) Contracts (1) Country Region Reference Data (1) Creating presentations using deck.js (1) Crescent (1) Data Explorer (2) Data Mining (2) Data Modeling (1) Data Quality (1) Data Quality Service (1) DataTables (1) Date (1) DAX (1) deck.js (1) Decrypt Files (1) Demo (1) Denali (6) Detect Categories (1) Dropbox (1) Eclipse IDE for Java EE Developers (1) Editor (1) Enter password for keyring 'default' to unlock on Ubuntu (1) Entity Framework 4.0 (1) error (1) ERwin (3) excel (1) Excel 2010 (2) Excel 2013 (1) Excel Services (1) Filezila (1) Format (1) Free (5) free ebooks (2) Free training (2) Fusion Tables (1) Getaround.com (1) Google (2) Google Cloud Connect for Microsoft Office (1) Google Currents (1) Google Docs (1) Google Graph Calculator (1) Hadoop (2) Hadoop for Microsoft Windows (1) Highlight (1) Hive (2) How to embed a Google Book Preview in Blogger? (1) How to embed word (1) HP Envy Spectre (1) HP Z1 Workstation (1) HTML (1) HTML-Kit (1) HTML5 (1) HTML5 animation using Mugeda tool (1) HTML5 Video (1) iBooks (1) Improvement (1) Inspiration (1) Install (1) Install Oracle SH schema (1) Installation (1) Installing (1) Installing Android Apps on Windows using bluestacks (1) Installing EntityFrameWork (1) Integration Services (1) iOS5 (1) iPad (2) iPhone (1) Issue (1) JavaScript (7) jQuery (5) Kinect (1) Knockout.js (1) Komodo Edit (1) Komodo IDE (1) Kompozer (1) Language Change (1) Learn SQL Server 2012 (1) Learn Windows Azure (1) Lego Building Blocks (1) Lifebrowser (1) LightSwitch (2) LINQ Oracle (1) Managing Project (1) mds (1) Menus using CSS3 (1) Microsoft (2) Microsoft BI (6) Microsoft Research (1) Microsoft's Future Vision on Productivity (1) Model first (1) mongodb (1) Motivation (2) Movie (1) MS Office (1) MSDN (1) NASA (1) nautilus-open-terminal (1) Netbeans (1) Netflix (2) Notepad++ (1) NuGet Package Manager (1) OBIEE (4) OData (2) ODP.NET (1) Office (2) Office 2013 (1) Office 2013 Quick Start Guides (1) Office Diagnostics (1) Office Tips and Tricks (1) Office Touch Guide (1) Office365 (2) Omnibox (1) Open ssh (1) ORA Errors (1) ORA-12514 (1) Oracle (1) Oracle 11g Express Edition Options Not Included (1) Oracle 11g Invisible Indexes (1) Oracle 11g New Features (1) Oracle 11g Read Only tables (1) Oracle APEX (2) Oracle Cloud (1) Oracle Data Pump (1) Oracle Express (1) Oracle Express 11g (2) oracle java 7 (1) Oracle Learning Library (1) Oracle OData (1) Oracle Open World 2011 (1) Oracle Pivot and Unpivot (1) Oracle Prebuilt VM Appliances (1) Oracle SQL Developer (1) Oracle Virtual Columns (1) Oracle XML DB (1) Outlook 2010 (3) Performance (1) PerformancePoint Services (2) Personal (1) Phonegap (1) Picture (1) PluralSight (1) Power View (2) PowerPivot (14) powerpoint in a site or a blog (1) PowerShell (1) Prettyprint (1) prezi (1) Project Juneau (1) Python (1) Rainbow (1) Recipe (1) Recorder (1) Rent (1) Reporting Services (2) Resources (1) RIA Services (1) Ruby (1) Samsung Galaxy S III (ATT) with Android 4.1 Jelly Bean (2) Screen Capture (1) Search (2) Security (1) Self Service BI (4) Server Error (1) Seth Godin (1) Setup (1) SharePoint (1) SharePoint 2010 (9) SharePoint 2010 Training (1) SharePoint BI Performance Point (3) Shell Scripting (1) Shutter (1) Silverlight (1) Skydrive (1) SOA (2) Social Network (1) Soup (1) SP2-0750 (1) Spotify (1) SQL Azure (1) SQL Server 2008 R2 (1) SQL Server 2008 R2 Reporting Services (1) SQL Server 2012 (5) sql server denali data quality services (1) SQL Server Virtual Labs (2) SQL*Plus error (1) Squash (1) SSAS Tabular (1) Storing and Retrieving XML data in Oracle 11g (1) Tabular (1) TechEd 2012 (1) TED (1) Terminator (1) Texmaker (1) Thunderstorm (1) TimeTo365.com (1) Tip (1) Tips (1) Tips and Tricks (1) Training (1) Ubuntu (1) Ubuntu 11.04 (1) Ubuntu Oracle (1) unrar (1) Vertica (2) Vertica VM appliance (1) video.js (1) VirtualBox (1) Visio 2010 crashes (1) Visio Services (2) Visual Studio 11 Developer Preview Training Kit (1) Visual Studio 2010 (1) Visual Studio 2010 and .NET Framework 4 Training Kit (1) Visual Studio Express 2012 (1) VLC (1) VM (1) WCF (1) Web Stack (1) WebMatrix (1) Webucator (1) What is new in SQL Server 2012 Analysis Services (1) Whole Foods Boise ID (1) Why won’t Microsoft Lync 2010 start (1) Windows 2008R2 SP1 (1) Windows 7 (1) Windows 8 Charms (1) Windows 8 Consumer Preview Demo (1) Windows 8 Contracts (1) Windows 8 Metro Apps (3) Windows 8 Touch Language (1) Windows 8 Video Driver issue/problem (1) Windows App Store (1) windows azure (3) Windows Phone 7 (1) Windows Phone Development (1) Wine (1) Word 2010 (1) Word 2013 (1) Word PDF Reflow (1) Work (1) zip (1) ZoomIt v4.2 (1)

Disclaimer

This blog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my opinion. My blog comes with no guarantees, and the content might contain errors. Expect to find a repeat of information that you can find in other blogs and sites. This is mainly for my future reference, It is my way of documenting things. I give due credit to contents, images, information sourced from product demos and other external sources wherever possible.