|
Home > Archive > microsoft.public.sqlserver.server > November 2002 > Horizontally Partitioning Tables.
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
Horizontally Partitioning Tables.
|
|
| Bruce de Freitas 2002-11-21, 10:24 am |
| Chandru, how is the data being accessed? Is there some
unique single column key maybe? Meaning.. can someone
access data in any given query from ANY of the tables,
assuming you split into 10 million row tables? Maybe how
you split up the data MIGHT be more important knowing how
it's accessed... like.. instead of strictly having 10
million rows per table, maybe have tables by an exact
WEEK? so you know all data from the first week of the 26
weeks total is in table_Week_1?? Roll them off every 26
weeks?!? Then you could have one or more views that UNION
all those WEEKLY tables together. But this helps if you
know how query's selection criteria are coded. do you?
or can anyone request any combo of data across the 6
months?
I'd think managing a set of smaller tables would be
better, and you'd have indexes on each table, so access
could be fast. This also depends on search criteria.
Space-wise, you'll take up more this way because of the
extra indexes... Bruce
>-----Original Message-----
>Hi all,
>
>A general architecture question about SQL Server 2000. I
have a given table
>A, where the only data modification operation are
Inserts. The table grows
>at the rate of a million rows a day, and I need to keep
the data around for
>atleast 6 months. Assuming that I have only 1 Server and
cannot distribute
>the load using partition views:
>
>1. Would it be better to horizontally partition the
tables into chunks of 10
>million rows per table? This would isolate the inserts
into only the current
>table based upon an identity column. The selects would
then be distributed
>among the rest of the tables?
>
>or
>
>2. Just have 1 huge table which would grow upto 200
million rows?
>
>Thanks in Advance,
>CR
>
>
>
>.
>
| |
| Chandru R 2002-11-21, 10:24 am |
| Bruce,
The data access will be on the Primary Key i.e. an incrementing value and
will only access 1 table at a time. The tables will be partitioned on the
Primary Key
i.e. Table1: 1 - 10000000
Table 2: 10000001 - 20000000
.....
I agree to that a bunch of smaller tables would be effecient, not really
worried abt space.
Thanks
Chandru
"Bruce de Freitas" <bruce@defreitas.com> wrote in message
news:e50c01c29174$0a46ef50$8af
82ecf@TK2MSFTNGXA03...
> Chandru, how is the data being accessed? Is there some
> unique single column key maybe? Meaning.. can someone
> access data in any given query from ANY of the tables,
> assuming you split into 10 million row tables? Maybe how
> you split up the data MIGHT be more important knowing how
> it's accessed... like.. instead of strictly having 10
> million rows per table, maybe have tables by an exact
> WEEK? so you know all data from the first week of the 26
> weeks total is in table_Week_1?? Roll them off every 26
> weeks?!? Then you could have one or more views that UNION
> all those WEEKLY tables together. But this helps if you
> know how query's selection criteria are coded. do you?
> or can anyone request any combo of data across the 6
> months?
>
> I'd think managing a set of smaller tables would be
> better, and you'd have indexes on each table, so access
> could be fast. This also depends on search criteria.
> Space-wise, you'll take up more this way because of the
> extra indexes... Bruce
>
>
> >-----Original Message-----
> >Hi all,
> >
> >A general architecture question about SQL Server 2000. I
> have a given table
> >A, where the only data modification operation are
> Inserts. The table grows
> >at the rate of a million rows a day, and I need to keep
> the data around for
> >atleast 6 months. Assuming that I have only 1 Server and
> cannot distribute
> >the load using partition views:
> >
> >1. Would it be better to horizontally partition the
> tables into chunks of 10
> >million rows per table? This would isolate the inserts
> into only the current
> >table based upon an identity column. The selects would
> then be distributed
> >among the rest of the tables?
> >
> >or
> >
> >2. Just have 1 huge table which would grow upto 200
> million rows?
> >
> >Thanks in Advance,
> >CR
> >
> >
> >
> >.
> >
| |
| Bruce de Freitas 2002-11-21, 11:24 am |
| oh, so this is a single row select where you always know
the PK you're searching for? Not a multi-row select for
some aggregation... so, yes, it sounds like if you have a
clustered PK using BIGINT, the separate smaller indexes
might perform better...
I'd still think about the table cutoff being more like by
time period, then just a number. Kind of the same thing,
but easier to know that table_3 is the 3rd week, in CASE
you have to go there and fix data, etc... Can schedule
jobs easier that way too, if it's a 23.9x7 application...
Might want to union the separate tables togther in a VIEW,
unless you plan to know which specific table a PK sits
in? Again, maybe back to some DATE field, if the tables
are in DATE sequence? or keep a master table to point you
at which of the 26 tables your data is in... Bruce
>-----Original Message-----
>Bruce,
>
>The data access will be on the Primary Key i.e. an
incrementing value and
>will only access 1 table at a time. The tables will be
partitioned on the
>Primary Key
>i.e. Table1: 1 - 10000000
> Table 2: 10000001 - 20000000
>
>.....
>
>I agree to that a bunch of smaller tables would be
effecient, not really
>worried abt space.
>
>Thanks
>Chandru
>
>
>"Bruce de Freitas" <bruce@defreitas.com> wrote in message
> news:e50c01c29174$0a46ef50$8af
82ecf@TK2MSFTNGXA03...
>> Chandru, how is the data being accessed? Is there some
>> unique single column key maybe? Meaning.. can someone
>> access data in any given query from ANY of the tables,
>> assuming you split into 10 million row tables? Maybe
how
>> you split up the data MIGHT be more important knowing
how
>> it's accessed... like.. instead of strictly having 10
>> million rows per table, maybe have tables by an exact
>> WEEK? so you know all data from the first week of the 26
>> weeks total is in table_Week_1?? Roll them off every 26
>> weeks?!? Then you could have one or more views that
UNION
>> all those WEEKLY tables together. But this helps if you
>> know how query's selection criteria are coded. do you?
>> or can anyone request any combo of data across the 6
>> months?
>>
>> I'd think managing a set of smaller tables would be
>> better, and you'd have indexes on each table, so access
>> could be fast. This also depends on search criteria.
>> Space-wise, you'll take up more this way because of the
>> extra indexes... Bruce
>>
>>
>> >-----Original Message-----
>> >Hi all,
>> >
>> >A general architecture question about SQL Server 2000.
I
>> have a given table
>> >A, where the only data modification operation are
>> Inserts. The table grows
>> >at the rate of a million rows a day, and I need to keep
>> the data around for
>> >atleast 6 months. Assuming that I have only 1 Server
and
>> cannot distribute
>> >the load using partition views:
>> >
>> >1. Would it be better to horizontally partition the
>> tables into chunks of 10
>> >million rows per table? This would isolate the inserts
>> into only the current
>> >table based upon an identity column. The selects would
>> then be distributed
>> >among the rest of the tables?
>> >
>> >or
>> >
>> >2. Just have 1 huge table which would grow upto 200
>> million rows?
>> >
>> >Thanks in Advance,
>> >CR
>> >
>> >
>> >
>> >.
>> >
>
>
>.
>
|
|
|
|
|