LucidDbHorizontalPartitioning and Firewater

LucidDbHorizontalPartitioning and Firewater

OG
Hello,

I read http://pub.eigenbase.org/wiki/LucidDbHorizontalPartitioning , but I also saw references to Firewater.  Is Firewater meant to serve as (an alternative?) mechanism to partition large LucidDB databases/tables over N servers, yet provide a unified view of them, so that even joins work across partitions living on different servers?

Thanks,
Otis


------------------------------------------------------------------------------
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev 
_______________________________________________
luciddb-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/luciddb-users

Re: LucidDbHorizontalPartitioning and Firewater

Nicholas Goodman
On Dec 17, 2009, at 3:43 AM, OG wrote:

> I read http://pub.eigenbase.org/wiki/LucidDbHorizontalPartitioning ,  
> but I also saw references to Firewater.  Is Firewater meant to serve  
> as (an alternative?) mechanism to partition large LucidDB databases/
> tables over N servers, yet provide a unified view of them, so that  
> even joins work across partitions living on different servers?

Hi again Otis!

LucidDB currently has no immediate plans to enable MPP or partitioning
inside a LucidDB instance.  Firewater is an add-on project that uses
LucidDB instances to do what you are asking about.  Firewater is
still in R&D for sure, but JVS and DynamoBI are pushing it along
slowly but surely.  Single-table queries and a variety of pushdowns (for
aggregations, etc.) are already working, but there's still plenty to do.

Here's a presentation I gave at OpenSQLCamp in Portland this year.
It has a couple of diagrams that show the deployment topology (with
LucidDB and Firewater instances).

http://www.nicholasgoodman.com/entry_images/firewater_openssqlcamp2009.ppt

Kind Regards,
Nick


Re: LucidDbHorizontalPartitioning and Firewater

OG
Hi,

I have a question about "LucidDB currently has no immediate plans to enable MPP or partitioning inside a LucidDB instance".

* Does that mean that LucidDB currently doesn't use multiple CPU cores in any situation? E.g. not during loading, not during single query execution, and not during execution of multiple concurrent queries?

* Is there a Powered-by type list/wiki page where one can see who is using LucidDB and how?

* How large are the datasets people have used with LucidDB?  I realize knowing this doesn't give me a full picture - we'd need to know the types of queries, schema details, query concurrency, query latency, etc.... but I doubt such details have been published by any LucidDB users out there?


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch




Re: LucidDbHorizontalPartitioning and Firewater

ngoodman

On Dec 21, 2009, at 9:54 AM, OG wrote:

> * Does that mean that LucidDB currently doesn't use multiple CPU  
> cores in any situation? e.g. not during loading, not during single  
> query execution, and not during execution of multiple concurrent  
> queries
LucidDB WILL NOT use multiple cores for a single query, you'd have to  
use Firewater for that.
LucidDB CAN use multiple cores for loading if the source is  
partitioned (a view on top of multiple flat files, etc).
LucidDB WILL use multiple cores when multiple queries are being  
executed.
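
To make the "partitioned source" idea concrete, here is a minimal Python sketch of the general pattern: several flat-file partitions are read independently and then concatenated, much like a UNION ALL view over them. It is only an illustration of the structure, not LucidDB's loader; the file names, the in-memory file contents, and the helper names are all made up for the example.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions: in-memory stand-ins for separate flat files.
PARTITIONS = {
    "sales_2007.csv": "id,amount\n1,10\n2,20\n",
    "sales_2008.csv": "id,amount\n3,30\n",
    "sales_2009.csv": "id,amount\n4,40\n5,50\n",
}

def read_partition(name):
    """Parse one partition independently of the others."""
    reader = csv.DictReader(io.StringIO(PARTITIONS[name]))
    return [(int(r["id"]), int(r["amount"])) for r in reader]

def load_all(names, workers=3):
    """Read every partition concurrently, then concatenate the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input names
        chunks = list(pool.map(read_partition, names))
    return [row for chunk in chunks for row in chunk]

rows = load_all(sorted(PARTITIONS))
print(len(rows), "rows loaded from", len(PARTITIONS), "partitions")
```

Because each partition is read without reference to the others, the work can be handed to separate workers; a single unpartitioned file offers no such natural split.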

> * Is there a Powered-by type list/wiki page where one can see who is
> using LucidDB and how?
No, but that sounds like a good idea.

> * How large are the datasets people have used with LucidDB?  I  
> realize knowing this doesn't give me a full picture - we'd need to  
> know the types of queries, schema details, query concurrency, query  
> latency, etc.... but I doubt such details have been published by any  
> LucidDB users out there?
I think the largest publicly known dataset is the 1 TB SSB (Star Schema
Benchmark) run.  In general, and people on this list can chime in, most
people are using LucidDB with source datasets of 5-500 GB, a few hundred
million to a few billion rows, which typically compress by about 5-10x
(depending on the data).  Also, I think there's a general "star schema" approach.
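
As a rough back-of-the-envelope check on those numbers, the short Python sketch below turns a source size and the 5-10x compression range into an estimated stored size. The function name and the sample sizes are chosen just for illustration, and real ratios depend on the data.

```python
def compressed_range_gb(source_gb, min_ratio=5, max_ratio=10):
    """Return (smallest, largest) estimated stored size in GB."""
    return source_gb / max_ratio, source_gb / min_ratio

for source_gb in (5, 50, 500):
    low, high = compressed_range_gb(source_gb)
    print(f"{source_gb} GB source -> roughly {low:g} to {high:g} GB stored")
```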

Any other LucidDB'ers care to comment on their volumes/size/etc?

We're trying to get a couple of DynamoBI customer use cases documented
(referenceable).

Nick


Re: LucidDbHorizontalPartitioning and Firewater

Francisco Reyes-3
Nicholas Goodman writes:

> In general, and people on this list can chime in, most people are using

Currently moving about 250 GB (1.5 billion rows) to a Lucid star schema.
I initially loaded it in a non-star schema just for easy testing against the
same dataset in PostgreSQL. On the non-star schema, Lucid was about 4
times faster than Postgres on the same hardware. For a few queries, Postgres
chose the wrong plan and Lucid was 100 times faster, but once I manually
forced a different plan, Postgres did better and Lucid was "only" 4
times faster.

OG,
Why not just try it? No matter what works for others and how many companies
you find using Lucid or how big the data other people may have in Lucid, ultimately
you have to test it with your data.

What do you have your data on right now? Couldn't you just benchmark Lucid
vs whatever you have now?  



Re: LucidDbHorizontalPartitioning and Firewater

OG
I don't have any data right now :)

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch




Re: LucidDbHorizontalPartitioning and Firewater

Francisco Reyes-3
OG writes:

> I don't have any data right now :)

Don't let that stop you. :-)

Install Lucid, play around loading some small ASCII files.

I remember the manual saying something along the lines of: you will
probably find the import difficult to learn, but after you learn it you will
like it. I think that is true.

Depending on how much my company gets to use Lucid, maybe we could give some
ETL code back to the project, especially after we get the Postgres proxy
working. We will likely always keep data in Postgres as our initial
load/storage and then have a second copy in Lucid, so there will be lots of
ETL to set up.


Re: LucidDbHorizontalPartitioning and Firewater

John Sichi
Administrator
In reply to this post by ngoodman
Nicholas Goodman wrote:
> LucidDB WILL NOT use multiple cores for a single query, you'd have to  
> use Firewater for that.

One small refinement to this:  LucidDB will use multiple cores for a
single query in the case where it contains UDX invocations (n+1 threads,
where n is the number of UDX invocations).  This is not intended as a
means of parallelized speedup (although it may provide that as a
byproduct); it's just a necessary implementation artifact due to the
convenience of the UDX calling convention where the control lies in the
user code.
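
A minimal Python sketch of why a push-style calling convention leads to n+1 threads: when the user code decides when to emit rows, a consumer that wants to pull rows has to run that code in a thread of its own. This is an analogy only, not LucidDB's UDX API; `run_udx`, `double_rows`, and `square_rows` are hypothetical names.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of one invocation's output

def run_udx(udx, inputs):
    """Run a push-style function in its own thread; return a pull iterator."""
    q = queue.Queue(maxsize=16)

    def worker():
        try:
            udx(inputs, q.put)  # user code calls emit(row) whenever it likes
        finally:
            q.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()

    def pull():
        while True:
            row = q.get()
            if row is _DONE:
                return
            yield row

    return pull()

def double_rows(inputs, emit):
    for x in inputs:
        emit(x * 2)

def square_rows(inputs, emit):
    for x in inputs:
        emit(x * x)

# The main thread plus one thread per invocation: n + 1 threads with n = 2.
doubled = run_udx(double_rows, [1, 2, 3])
squared = run_udx(square_rows, [1, 2, 3])
result = list(zip(doubled, squared))
print(result)
```

Any overlap between the worker threads is incidental here, which matches the point above: the threads exist to hand control to the user code, not to speed the query up.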

JVS
