Greenplum MPP Query Execution

When learning a new database, I always prefer to start with how it executes a given SQL statement or task. For the Greenplum MPP database, here are my findings on how it works:

1. The client connects to the Postmaster process on the master

2. The Postmaster process spawns a backend process for the session, the Query Dispatcher (QD)

3. The client then submits its SQL statements to the QD for execution

4. The Query Dispatcher (QD) is the process that:

a. Runs only on the master, as the driving and coordinating process

b. Optimizes the SQL using catalog data

c. Creates the execution plan

d. Writes the changes and the DTM context to the WAL

e. Coordinates distributed transactions (DTM)

5. The QD calls the segment processes for execution, the Query Executors (QE), and submits the execution plan to them

a. A Query Executor (QE) is the segment-side worker process responsible for query execution on each segment node

b. It handles gang communication across the segments

c. It sends the final result set to the master QD

6. SQL execution: each QE takes the execution plan tree and starts working on it, using local catalog data, the buffer cache, disk I/O, etc.

7. Gang communication: since each segment works on its own set of data, the segments need to communicate with each other about who is doing what. They also share data for joins through motions.

8. Once all the segments are done with execution, the results are submitted to the master. The master does the final aggregation and returns the result to the client.
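
To make steps 5 to 8 a bit more concrete, here is a small toy simulation in plain Python. It is only a sketch of the idea, not Greenplum code: the three-segment cluster, the modulo "hash", the orders/customers tables and every function name (distribute, redistribute_motion, qe_join_and_sum, qd_gather, run_query) are assumptions made up for illustration. It shows rows being redistributed so that matching join keys meet on the same segment, each QE computing a partial result, and the QD merging the partials.

```python
from collections import defaultdict

NUM_SEGMENTS = 3  # hypothetical cluster size, chosen for the example


def segment_for(key):
    """Pick a segment for a distribution key (toy stand-in for real hashing)."""
    return key % NUM_SEGMENTS


def distribute(rows, key_index):
    """Place each row on a segment according to the value at key_index."""
    segments = [[] for _ in range(NUM_SEGMENTS)]
    for row in rows:
        segments[segment_for(row[key_index])].append(row)
    return segments


def redistribute_motion(segments, key_index):
    """Reshuffle rows between segments so equal join keys become co-located."""
    all_rows = [row for seg in segments for row in seg]
    return distribute(all_rows, key_index)


def qe_join_and_sum(orders, customers):
    """Work of one QE: local hash join, then a partial SUM per customer name."""
    names = {cust_id: name for cust_id, name in customers}
    partial = defaultdict(int)
    for _order_id, cust_id, amount in orders:
        if cust_id in names:
            partial[names[cust_id]] += amount
    return dict(partial)


def qd_gather(partials):
    """Work of the QD: merge the partial results sent back by every segment."""
    final = defaultdict(int)
    for partial in partials:
        for name, total in partial.items():
            final[name] += total
    return dict(final)


def run_query(orders, customers):
    """Simulate: SELECT name, SUM(amount) ... JOIN ... GROUP BY name."""
    # orders are distributed by order_id (column 0), customers by cust_id (column 0)
    order_segs = distribute(orders, 0)
    cust_segs = distribute(customers, 0)
    # the join key is cust_id, so orders are redistributed on column 1
    order_segs = redistribute_motion(order_segs, 1)
    # every segment runs the same plan slice on its local data
    partials = [qe_join_and_sum(o, c) for o, c in zip(order_segs, cust_segs)]
    return qd_gather(partials)


if __name__ == "__main__":
    orders = [(1, 10, 100), (2, 11, 50), (3, 10, 25), (4, 12, 70)]
    customers = [(10, "ann"), (11, "bob"), (12, "cy")]
    print(run_query(orders, customers))
```

In this sketch the redistribution step is what makes the local joins correct: without it, an order could sit on a different segment than its customer and the QE would silently miss the match. The real system decides between redistributing and broadcasting based on the plan, which this toy does not attempt to model.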

Comments

Anonymous, June 28, 2013 at 11:45 PM
Good place. I like it a lot… but why is it so brief?

Santosh Kangane, June 29, 2013 at 2:33 PM
As this is the first post on GPDB, I thought of keeping it brief and crisp!
Santosh Kangane
