Jan-Henrik,
I'm running into some problems with the mysql backend. I can't seem to retrieve long-ish strings.
The attached code will reliably segfault on my systems.
This code does *not* fail on Postgresql.
Am I doing something wrong?
thanks,
Ok, I get it, I think.
Reading the source I come to the conclusion that I should use ResetSet_readData for any column wider than STRLEN chars rather than _getString or _getBlob
kind of defeats the purpose of those last two methods, doesn't it?
Or am I still missing something here?
Paul J Stevens wrote:
Jan-Henrik,
I'm running into some problems with the mysql backend. I can't seem to retrieve long-ish strings.
The attached code will reliably segfault on my systems.
This code does *not* fail on Postgresql.
Am I doing something wrong?
thanks,
-- To unsubscribe: http://www.tildeslash.com/mailman/listinfo/libzdb-general
On 19. mars. 2008, at 15.40, Paul J Stevens wrote:
Ok, I get it, I think.
Reading the source I come to the conclusion that I should use ResetSet_readData for any column wider than STRLEN chars rather than _getString or _getBlob
No, _getString and _getBlob should of course work with any column length data. This looks to be a number overflow bug of some sorts in ensureCapacity() according to Martin, which we of course will fix. I'm just stumped we haven't seen this before, ourselves. Must be that we mainly work with small data and using SQLite.
Jan-Henrik
kind of defeats the purpose of those last two methods, doesn't it?
Or am I still missing something here?
Paul J Stevens wrote:
Jan-Henrik,
I'm running into some problems with the mysql backend. I can't seem to retrieve long-ish strings.
The attached code will reliably segfault on my systems.
This code does *not* fail on Postgresql.
Am I doing something wrong?
Jan-Henrik Haukeland wrote:
On 19. mars. 2008, at 15.40, Paul J Stevens wrote:
Ok, I get it, I think.
Reading the source I come to the conclusion that I should use ResetSet_readData for any column wider than STRLEN chars rather than _getString or _getBlob
No, _getString and _getBlob should of course work with any column length data. This looks to be a number overflow bug of some sorts in ensureCapacity() according to Martin, which we of course will fix. I'm just stumped we haven't seen this before, ourselves. Must be that we mainly work with small data and using SQLite.
Glad to hear you guys are on it. I'm already quite seriously committed to using zdb. I'm still single threaded, but my database code is fully updated. I've begun putting some serious pressure on the postgres driver which is holding out great. That means I really need to deal with any surprises in the other drivers. But your code is a pleasure to work with so far.
I was zooming in on ensureCapacity myself. Isn't mysql_stmt_store_result required to determine the width of a column? R->columns[i]->length appears to stay 0. I also can't see an obvious solution that avoids allocating rows*max_length for each MysqlResultSet. But I don't have any experience with the mysql_stmt api (yet).
Anyway, curious to see what Martin comes up with.
Hi,
the problem seems to be in the ensureCapacity():
--8<-- static int ensureCapacity(T R, int i) { if ((R->columns[i]->length > R->bind[i].buffer_length)) { /* Assume truncation, resize and fetch column directly. */ unsigned long newSize = (sizeof *(R->columns[i]) + 1 + R->columns[i]->field->length); memset(&R->bind[i], 0, sizeof(MYSQL_BIND)); R->bind[i].buffer_type = R->columns[i]->field->type; RESIZE(R->columns[i], newSize); R->bind[i].buffer = R->columns[i]->buffer; R->bind[i].buffer_length = R->columns[i]->field->length; R->bind[i].is_null = &R->columns[i]->is_null; R->bind[i].length = &R->columns[i]->length; R->columns[i]->field = mysql_fetch_field_direct(R->meta, i); if (mysql_stmt_fetch_column(R->stmt, &R->bind[i], i, 0)) { THROW(SQLException); return false; } } return true; } --8<--
the R->columns[1]->length is bigger then R->columns[1]->field->length ... which is 4GB (MySQL BLOB range)
(gdb) p R->columns[1]->length $26 = 648
(gdb) p R->columns[1]->field->length $27 = 4294967295
The newSize is unsigned long ... 4GB for ILP32 => the newSize overflows:
(gdb) p newSize $28 = 268
The R->columns[i] is resized to newSize (wrong length) and the R->bind[i].buffer_length is set to 4GB
The mysql_stmt_fetch_column() then corrupts the memory.
... the full *R->columns[1] structure (before reallocation):
(gdb) p *R->columns[1] $15 = {is_null = 0 '\0', field = 0x807de00, length = 648, buffer = "From nobody@pacific.net.sg Tue Dec 04 19:52:17 2007\nX-Envelope-From: nobody@pacific.net.sg\nReceived: from [127.0.0.1] (port=49353 helo=test11)\n by centos.nowhere.com with smtp (Exim 4.63)\n "...}
(gdb) p *R->columns[1]->field $16 = {name = 0x807dea0 "data", org_name = 0x807dea8 "data", table = 0x807de90 "test", org_table = 0x807de98 "test", db = 0x807de88 "test", catalog = 0x807de80 "def", def = 0x0, length = 4294967295, max_length = 0, name_length = 4, org_name_length = 4, table_length = 4, org_table_length = 4, db_length = 4, catalog_length = 3, def_length = 0, flags = 4241, decimals = 0, charsetnr = 63, type = MYSQL_TYPE_BLOB}
Martin
Paul J Stevens wrote:
Jan-Henrik Haukeland wrote:
On 19. mars. 2008, at 15.40, Paul J Stevens wrote:
Ok, I get it, I think.
Reading the source I come to the conclusion that I should use ResetSet_readData for any column wider than STRLEN chars rather than _getString or _getBlob
No, _getString and _getBlob should of course work with any column length data. This looks to be a number overflow bug of some sorts in ensureCapacity() according to Martin, which we of course will fix. I'm just stumped we haven't seen this before, ourselves. Must be that we mainly work with small data and using SQLite.
Glad to hear you guys are on it. I'm already quite seriously committed to using zdb. I'm still single threaded, but my database code is fully updated. I've begun putting some serious pressure on the postgres driver which is holding out great. That means I really need to deal with any surprises in the other drivers. But your code is a pleasure to work with so far.
I was zooming in on ensureCapacity myself. Isn't mysql_stmt_store_result required to determine the width of a column? R->columns[i]->length appears to stay 0. I also can't see an obvious solution that avoids allocating rows*max_length for each MysqlResultSet. But I don't have any experience with the mysql_stmt api (yet).
Anyway, curious to see what Martin comes up with.
Its fixed. You can check out the code from the repository at http://code.google.com/p/libzdb/source/checkout . re2c (http://re2c.org/ or apt-get re2c) is needed to build from scratch. Call bootstrap first, then configure.
To save memory, the MySQL driver is designed not to slurp down large data sets, but read as needed. Each column is preallocated with a 256 byte buffer. This is more than enough for numbers and usually enough for strings, as real life strings often tend to be shorter. Buffers are reallocated if needed, it was this reallocation that had a bug which should now be fixed.
We have mainly been working on developing an application using the SQLite driver and haven't tested all drivers of libzdb _extensively_ in a real application. Its unfortunate, but at the same time its great that you are able to test all parts and we'll try our best to respond as soon as we can. I hope you haven't lost all confidence in the library. The code should be well designed and easy to read and if you want too, you are most welcome to provide patches if more trouble should arise, though I hope we have ironed out most now.
We'll release a new version as soon as the postgresql problem you also reported has been fixed. I don't use postgres myself and I believe Martin is on vacation, so if you want to give it a stab I'll be happy to accept a patch :-)
Jan-Henrik
On 17. mars. 2008, at 16.32, Paul J Stevens wrote:
Jan-Henrik,
I'm running into some problems with the mysql backend. I can't seem to retrieve long-ish strings.
The attached code will reliably segfault on my systems.
This code does *not* fail on Postgresql.
Am I doing something wrong?
thanks,
Jan-Henrik Haukeland wrote:
Its fixed. You can check out the code from the repository at http://code.google.com/p/libzdb/source/checkout .. re2c (http://re2c.org/ or apt-get re2c) is needed to build from scratch. Call bootstrap first, then configure.
Great!
btw, glad you selected svn rather than cvs @ savanna (yuk). I'm an avid git fan myself, but I hear good things about mercurial as well.
But let's not digress.
To save memory, the MySQL driver is designed not to slurp down large data sets, but read as needed. Each column is preallocated with a 256 byte buffer. This is more than enough for numbers and usually enough for strings, as real life strings often tend to be shorter. Buffers are reallocated if needed, it was this reallocation that had a bug which should now be fixed.
I'll test it later today. Thanks for working on it.
We have mainly been working on developing an application using the SQLite driver and haven't tested all drivers of libzdb _extensively_ in a real application. Its unfortunate, but at the same time its great that you are able to test all parts and we'll try our best to respond as soon as we can. I hope you haven't lost all confidence in the library. The code should be well designed and easy to read and if you want too, you are most welcome to provide patches if more trouble should arise, though I hope we have ironed out most now.
Ha, don't worry. I'm already well beyond the point of chucking zdb just because of some small bugs. It's the big picture feature set of zdb combined with the very clean design and coding style you use that was the main selling point.
We'll release a new version as soon as the postgresql problem you also reported has been fixed. I don't use postgres myself and I believe Martin is on vacation, so if you want to give it a stab I'll be happy to accept a patch :-)
I've started to mirror your svn repo on git.dbmail.eu so I can maintain the /debian/ stuff, and play with exploratory zdb changes of my own. Once I'm done with my current code sprint in dbmail (libevent/libzdb/pthreads) I hope to be able to start working on an ingres driver for libzdb. Should be fun.
keep you posted.
Jan-Henrik,
A bit late in the afternoon, I'm testing your recent mysql fix. Seems to work great, but I found a regression in URL.re
Whenever I specify a port in the url, I get:
AssertException: ThreadData_set(Exception_stack, &Exception_frame)==0 raised in parseURL at src/net/URL.re:444 Aborted (core dumped)
Using the testzdb.c I sent before:
this fails:
URL_T url = URL_new("mysql://test:test@localhost:3306/test");
where this works:
URL_T url = URL_new("mysql://test:test@localhost/test");
Jan-Henrik Haukeland wrote:
On 21. mars. 2008, at 13.48, Paul J Stevens wrote:
Once I'm done with my current code sprint in dbmail (libevent/libzdb/ pthreads) I hope to be able to start working on an ingres driver for libzdb. Should be fun.
Great! looking forward to see how it goes.
To unsubscribe: http://www.tildeslash.com/mailman/listinfo/libzdb-general
On 21. mars. 2008, at 16.58, Paul J Stevens wrote:
Jan-Henrik,
A bit late in the afternoon, I'm testing your recent mysql fix. Seems to work great, but I found a regression in URL.re
Whenever I specify a port in the url, I get:
AssertException: ThreadData_set(Exception_stack, &Exception_frame)==0 raised in parseURL at src/net/URL.re:444 Aborted (core dumped)
Ah thanks for catching this. I have checked in a fix. Because an exception handler is installed when testing the URL port number, initialization of the exception stack has now been moved to the URL class, as URL is first used. I did not catch this, as all my tests programs either does not use a port number in the URL or the exception stack is explicit initialize in unit test programs.
Jan-Henrik,
below patch is still required to build with postgresql support
http://git.dbmail.eu/?p=paul/libzdb;a=commitdiff;h=2f3b8212802427710ddb344e1...
Jan-Henrik Haukeland wrote:
On 21. mars. 2008, at 16.58, Paul J Stevens wrote:
Jan-Henrik,
A bit late in the afternoon, I'm testing your recent mysql fix. Seems to work great, but I found a regression in URL.re
Whenever I specify a port in the url, I get:
AssertException: ThreadData_set(Exception_stack, &Exception_frame)==0 raised in parseURL at src/net/URL.re:444 Aborted (core dumped)
Ah thanks for catching this. I have checked in a fix. Because an exception handler is installed when testing the URL port number, initialization of the exception stack has now been moved to the URL class, as URL is first used. I did not catch this, as all my tests programs either does not use a port number in the URL or the exception stack is explicit initialize in unit test programs. -- To unsubscribe: http://www.tildeslash.com/mailman/listinfo/libzdb-general