Delivered-To: garrigue at math.nagoya-u.ac.jp Delivered-To: lablgtk at yquem.inria.fr Date: Wed, 1 Dec 2010 19:51:24 +0100 From: Hugo Herbelin To: Jacques Garrigue Subject: Re: [coqdev] Re: [Lablgtk] Double free problem using bytecode Message-ID: <20101201185124.GA1551 at pirogue.inria.fr> Mail-Followup-To: Jacques Garrigue , lablgtk@yquem.inria.fr, coqdev@inria.fr References: <20101104113045.GA21006 at pirogue.inria.fr> <207204746.262198.1288950823998.JavaMail.root@zmbs3.inria.fr> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="zYM0uCDKw75PZbzx" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <207204746.262198.1288950823998.JavaMail.root at zmbs3.inria.fr> Cc: lablgtk at yquem.inria.fr, coqdev at inria.fr Status: U --zYM0uCDKw75PZbzx Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit Hi Jacques, I found time to analyse further the problem. This has nothing to do with lablgtk. I can reproduce the bug with a few lines of ML that starts a thread running a code that forces a stack reallocation. Code is attached. I'll submit it to the caml bug tracker. Hugo On Fri, Nov 05, 2010 at 10:53:43AM +0100, Jacques Garrigue wrote: > This problem is quite mysterious, and it might be a bug in ocaml itself. > Since I cannot debug it myself, I can only tell you how lablgtk uses threads. > > Basically, lablgtk only allows ocaml threads to run, and does not use them > for itself. That is, when you use GtkThread.main, you replace the standard > main loop of Gtk by a custom loop which, at every iteration, (1) allows other > threads to run (Thread.delay 0.0001), and (2) runs the calls queued by > sync/async. Lablgtk itself never release ocaml's lock using caml_enter_blocking_section, > or run other threads explicitly. > So I don't see how it can break anything about threads. > > Now, if you use sync/async, Gtk will always be called from the same thread, > so that callbacks also will always come from the same thread. This may > explain why this is more robust, but normally this should not be required > on Unix (Gtk does not require it). Except if there is a restriction I don't understand > on bytecode threads. Maybe we should ask thread specialists about that. > > Jacques > > On 2010/11/04, at 20:30, Hugo Herbelin wrote: > > > Hi, > > > > We use lablgtk for the CoqIDE interface to Coq and we have "double > > free" problems on Linux with the bytecode version of the interface [1]. > > > > The problem is solved by ensuring that any call to gtk from the > > non-main thread are made using the sync/async functions. Is this then > > a discipline that is recommended not only for Windows but also for > > bytecode on Unix? > > > > Strangely, the problem is also solved by enforcing some dummy printing > > on stderr from time to time. > > > > Using valgrind, we could somehow trace the "double free" problem (see > > bottom of message). It shows that what is asked twice to be released > > is the pointer to the block used as a stack by the bytecode runtime, > > namely caml_stack_low: at some time an extension of the stack is done, > > a new bigger stack is allocated and the old caml_stack_low pointer is > > released. However, it looks like the updating of the caml_stack_low > > using the address of new stack gets broken when using gtk in threads > > since a second request for releasing the (old) value of the > > caml_stack_low pointer happens later on (unfortunately, but that may > > be also a hint, valgrind has no information on the call stack of the > > second call to free). > > > > Does anyone has an idea of what happens and of why using sync/async > > solves the problem? > > > > Hugo Herbelin and Stéphane Glondu > > with the additional help of Vincent Gross > > > > [1] http://www.lix.polytechnique.fr/coq/bugs/show_bug.cgi?id=2062 > > > > ---------------------------------------------------------------------- > > valgrind trace > > > > ==755== Command: bin/coqide.byte > > ==755== > > ==755== Thread 3: > > ==755== Invalid free() / delete / delete[] > > ==755== at 0x4C240FD: free (vg_replace_malloc.c:366) > > ==755== Address 0x10c3dff0 is 0 bytes inside a block of size 8,192 free'd > > ==755== at 0x4C240FD: free (vg_replace_malloc.c:366) > > ==755== by 0x40AEF1: caml_stat_free (in /usr/local/bin/ocamlrun) > > ==755== by 0x40AD1C: caml_realloc_stack (in /usr/local/bin/ocamlrun) > > ==755== by 0x41B0BD: caml_interprete (in /usr/local/bin/ocamlrun) > > ==755== by 0x41706C: caml_callbackN_exn (in /usr/local/bin/ocamlrun) > > ==755== by 0x417164: caml_callback_exn (in /usr/local/bin/ocamlrun) > > ==755== by 0x69570A8: caml_thread_start (in /usr/local/lib/ocaml/stublibs/dllthreads.so) > > > > _______________________________________________ > > Lablgtk mailing list > > Lablgtk@yquem.inria.fr > > http://yquem.inria.fr/cgi-bin/mailman/listinfo/lablgtk > > --zYM0uCDKw75PZbzx Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="thread_bug.ml" (* use: "ocamlc -o thread_bug -thread unix.cma threads.cma thread_bug.ml" *) let rec f x = 1 + (if x = 0 then 0 else f (x-1)) let _ = Thread.create f 200 (* After a few seconds, sometimes a minute, thread exits and a double free corruption is detected *) let _ = while true do () done --zYM0uCDKw75PZbzx Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Lablgtk mailing list Lablgtk@yquem.inria.fr http://yquem.inria.fr/cgi-bin/mailman/listinfo/lablgtk --zYM0uCDKw75PZbzx--