Tuesday, March 27, 2012

Correct procedure for concurrent transactions

Hi,
I am struggling to get my head around how to maximize throughput to
the server using several concurrent transactions. It seems I always
run into a massive number of deadlocks (in my test environment), which
the server dutifully resolves, but performance drops below what I get
without any concurrency at all.
To illustrate my problem, I have one transaction which inserts around
50 interdependent rows across 10 tables.
Right now, the flow is something like this (pseudo-language, with a rough
T-SQL sketch after the outline):
BEGIN TRANSACTION
INSERT ROW IN A
SELECT ID FROM A
INSERT INTO B ID FROM TABLE A (along with other data)
INSERT INTO C ID FROM TABLE A (along with yet other data)
GET ID FROM TABLE C
INSERT INTO TABLE D ID FROM TABLE C
INSERT INTO C ID FROM TABLE A AND ID FROM TABLE C (in other column)
INSERT ROW IN A
INSERT INTO E ID FROM TABLE A
END TRANSACTION
(etc etc)
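In rough T-SQL, one such chain looks something like this (all table and
column names are made up for illustration; the ID columns are IDENTITY
columns):
DECLARE @idA int, @idC int
BEGIN TRANSACTION
    -- illustrative schema only
    INSERT INTO A (SomeData) VALUES ('first row')
    SELECT @idA = MAX(ID) FROM A                  -- how I read the ID back today
    INSERT INTO B (A_ID, OtherData) VALUES (@idA, 'child of A')
    INSERT INTO C (A_ID, MoreData) VALUES (@idA, 'another child of A')
    SELECT @idC = MAX(ID) FROM C
    INSERT INTO D (C_ID) VALUES (@idC)
    INSERT INTO C (A_ID, Parent_C_ID) VALUES (@idA, @idC)
    INSERT INTO A (SomeData) VALUES ('second row')
    SELECT @idA = MAX(ID) FROM A
    INSERT INTO E (A_ID) VALUES (@idA)
COMMIT TRANSACTION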
The exact amount of work inside the transaction is variable, but
rule-based, i.e. it depends on the nature of the input from the end user.
This is all fine when I run it single-threaded (of course), but when I
run it multithreaded, deadlocks occur with alarming rapidity. I know
they can be minimized by careful selection of indexes etc., but what
I'm really after is a best-practices way of doing it. Currently I've
just dumbed it down by ensuring that only one thread runs the
transaction at a time, and performance is... OK. Not great, but OK.
Someone suggested lumping all accesses so they weren't spread out, i.e.
rewriting it to
INSERT INTO A
GET ID FROM A as id1
INSERT INTO A
GET ID FROM A as id2
INSERT INTO A
GET ID FROM A as id3
INSERT INTO B id1
INSERT INTO B id2
INSERT INTO C id1
GET ID FROM C as id6
INSERT INTO C id1,id6
INSERT INTO C id3
etc
This is a fairly complex change to make just to see if it works, so if
anybody has any input I'd appreciate it. My app has to work with
several DBMSs, so I would like to minimize the SQL Server-specific
parts.
I'm not sure how much the above would gain, as thread 2 would halt on
the first INSERT INTO A until thread 1 called commit or rollback.
Synchronization within my app is (obviously) not going to do any good,
since the DBMS will lock the tables for me anyway.
Currently I'm using MSSQL 2000, which is in the project's
minimum requirements.
My tests have been using TRANSACTION_ISOLATION_SERIALIZABLE.
So: any clues? Any links? Any books that would help me here?
The problem only gets worse when I factor in read-only operations and
other updates to some of the tables. Any word on best practices for
that?
Thx in advance.
A couple of things.
1. How are you getting the ID back?
2. Do you have proper indexes on each of the tables for these types of
operations?
3. Why are you using Serializable mode? Nothing I see in your example
warrants that. Try Read Committed instead.
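For reference, the T-SQL equivalent of the isolation constants you mention
is set per connection:
-- Read Committed is SQL Server's default isolation level
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
-- what your tests are effectively running now:
-- SET TRANSACTION ISOLATION LEVEL SERIALIZABLE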
Andrew J. Kelly SQL MVP
|||1) The table has an auto-incrementing column, so I insert into it and
read back the max value. For this, I need to be in serializable mode.
2) Yes; everything is indexed correctly and constrained with foreign
keys.
3) Since read committed could return a different max() for the ID, I
think I need to be in serializable mode, right?
I know I could use some SQL Server-specific way of retrieving the IDs,
but that wouldn't alleviate the locks taken on the first ID tables,
which are held until my transaction commits or rolls back.
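(The SQL Server-specific way I mean is roughly this, again with made-up
names, using SCOPE_IDENTITY(), which is available on SQL Server 2000:)
DECLARE @idA int
INSERT INTO A (SomeData) VALUES ('first row')
-- SCOPE_IDENTITY() returns the IDENTITY value generated by the last insert
-- in the current scope, so there is no MAX() scan and no need for
-- SERIALIZABLE just to read the ID back
SELECT @idA = SCOPE_IDENTITY()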
|||You will never get good performance generating the IDs that way. For
one, you are essentially making the application single-user through the
overuse of the SERIALIZABLE isolation level. As such, you force the app to
do things one at a time, and all the others must wait until the ones before
them are done. Either use an IDENTITY column or generate your own IDs with
something like this:
CREATE TABLE [dbo].[NEXT_ID] (
    [ID_NAME] [varchar] (20) NOT NULL ,
    [NEXT_VALUE] [int] NOT NULL ,
    CONSTRAINT [PK_NEXT_ID_NAME] PRIMARY KEY CLUSTERED
    (
        [ID_NAME]
    ) WITH FILLFACTOR = 100 ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE PROCEDURE get_next_id
    @ID_Name VARCHAR(20) ,
    @ID int OUTPUT
AS
    -- increment the counter and capture the new value in one atomic statement
    UPDATE NEXT_ID SET @ID = NEXT_VALUE = (NEXT_VALUE + 1)
    WHERE ID_NAME = @ID_Name
    RETURN (@@ERROR)
That will allow you to generate IDs for different tables at the same time,
and there is no need to increase the isolation level. You can generate
all the IDs for each of the tables before you begin the transaction and
then simply do the inserts.
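A usage sketch (the ID_NAME values and variables are just examples):
-- one-time setup: seed a counter row per table that needs generated IDs
INSERT INTO NEXT_ID (ID_NAME, NEXT_VALUE) VALUES ('TableA', 0)
INSERT INTO NEXT_ID (ID_NAME, NEXT_VALUE) VALUES ('TableC', 0)

-- per unit of work: grab all the IDs first, then run the short transaction
DECLARE @idA int, @idC int
EXEC get_next_id 'TableA', @idA OUTPUT
EXEC get_next_id 'TableC', @idC OUTPUT
-- ...then BEGIN TRANSACTION and use @idA / @idC in the INSERTs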
Andrew J. Kelly SQL MVP
