(random unfinished engineering notes)

Monday, December 15, 2008

cURL out of memory on Xen instance (use and abuse series)

cURL, part of everyone's favorite UNIX toolset, got me into a bit of trouble recently while trying to post a relatively large file, following the common 'just curl it' logic (so commonplace that a lot of major projects simply incorporate curl into their standard deploy procedure).

The case was posting an 8 GB file on a 16 GB Xen instance. While this worked quite nicely on a real box, on the virtual box curl said hello with:

out of memory

Now that seemed quite bizarre. Having figured out that the process actually gets ENOMEM, the logical next step was to look at the curl code and see what's going on.

And there it was, a power-of-two allocator in the file read loop:


static ParameterError file2memory(char **bufp, size_t *size, FILE *file)
...

  char *newbuf;
  char *buffer = NULL;
  size_t alloc = 512;
  size_t nused = 0;
  size_t nread;

  do {
    if(!buffer || (alloc == nused)) {
      /* size_t overflow detection for huge files */
      if(alloc+1 > ((size_t)-1)/2) {
        if(buffer)
          free(buffer);
        return PARAM_NO_MEM;
      }
      alloc *= 2;

      if((newbuf = realloc(buffer, alloc+1)) == NULL) {
        if(buffer)
          free(buffer);
        return PARAM_NO_MEM;
      }
      buffer = newbuf;
    }
    ...
  }




Whoa :) - apparently someone didn't expect that some geniuses would try to post XX GB files with curl - so it's the abusers that are to blame. Stop abusing curl and do your own posts!

However, if you don't have the time to change your app, and still want to post files with a size between N and 2N GB on a 2N GB box, a simple hack of the following form should do it:


if (alloc < ALLOC_THRESHOLD)
  alloc *= 2;
else
  alloc = alloc + ALLOC_THRESHOLD;



(where ALLOC_THRESHOLD would usually be 1 GB)

This should make the buffer grow linearly, rather than exponentially, once the allocated memory passes the given threshold.
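To see what the two growth policies do to an 8 GB upload, here is a small sketch that replays only the buffer sizing, not the real curl code. It assumes the read loop grows the buffer on its first pass and again every time the buffer is completely full, so the final buffer always ends up strictly larger than the file. The names final_alloc, grow_doubling and grow_capped are made up for this illustration, and 1 GiB for ALLOC_THRESHOLD is just the value suggested above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed threshold: 1 GiB, as suggested in the text. */
#define ALLOC_THRESHOLD ((uint64_t)1 << 30)

/* Original policy: always double the buffer. */
static uint64_t grow_doubling(uint64_t alloc)
{
  return alloc * 2;
}

/* Patched policy: double below the threshold, then grow by a fixed step. */
static uint64_t grow_capped(uint64_t alloc)
{
  if(alloc < ALLOC_THRESHOLD)
    return alloc * 2;
  return alloc + ALLOC_THRESHOLD;
}

/* Buffer size requested by the time file_size bytes have been read.
   Hypothetical helper: it models only the sizing, no memory is allocated. */
static uint64_t final_alloc(uint64_t file_size, uint64_t (*grow)(uint64_t))
{
  uint64_t alloc = 512;

  /* keep the simulated sizes far away from uint64_t overflow */
  assert(file_size < ((uint64_t)1 << 62));

  alloc = grow(alloc);        /* first pass: the buffer is still NULL */
  while(alloc <= file_size)   /* buffer filled up before end of file */
    alloc = grow(alloc);
  return alloc;
}
```

Under these assumptions, a file of exactly 8 GiB gives final_alloc((uint64_t)8 << 30, grow_doubling) == 16 GiB, which no longer fits into 15 GB of effective memory, while final_alloc((uint64_t)8 << 30, grow_capped) stops at 9 GiB.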


Now - what does it all have to do with Xen, you might ask?

A couple of things, actually. First off, such environments (local, or any virtualized cloud platform offering Xen instances) usually provide the user with an effective memory space of something like 2^N minus a penalty (say 15 GB instead of 16 GB) - and that's where the impact of a power-of-two allocator becomes apparent much sooner. Also - memory allocation policies are quite a bit stricter: ENOMEMs are dispatched much earlier, the OOM killer is fast on the trigger, etc. :) - so that's why curl OOMs immediately, rather than trying to make that darn realloc() work after all.

Moral of the story - don't abuse standard UNIX tools!
Be nice to curl - do not POST binary data larger than 50% of effective RAM.
Keep it safe !
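
A pre-flight check along those lines could look like the sketch below. It is only a rule-of-thumb guard, not anything curl does: fits_doubling_buffer and physical_ram_bytes are hypothetical helper names, and the sketch assumes that sysconf(_SC_PHYS_PAGES), which Linux and Solaris provide but POSIX does not require, is a good enough stand-in for "effective RAM".

```c
#include <stdint.h>
#include <unistd.h>

/* Rule of thumb from the text: the payload may use at most half of RAM,
   because a doubling buffer can end up about twice as large as the file. */
static int fits_doubling_buffer(uint64_t file_size, uint64_t ram_bytes)
{
  return file_size <= ram_bytes / 2;
}

/* Physical RAM as reported by the OS, or 0 if it cannot be determined.
   Assumption: _SC_PHYS_PAGES is available on the target platform. */
static uint64_t physical_ram_bytes(void)
{
  long pages = sysconf(_SC_PHYS_PAGES);
  long page_size = sysconf(_SC_PAGESIZE);

  if(pages <= 0 || page_size <= 0)
    return 0;
  return (uint64_t)pages * (uint64_t)page_size;
}
```

A deploy script wrapper could call fits_doubling_buffer(file_size, physical_ram_bytes()) before handing the file to curl, and fall back to a different upload method when it returns 0.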

Thursday, July 10, 2008

Instrumenting Java code for fun&profit

By code instrumentation we mean the process of adding bytecode to methods in order to intercept their execution (usually for profiling purposes).
Some of the purposes include: code tracing (capturing the method calls and regenerating the call tree), code profiling (measuring the execution times between calls, automatic detection of bottlenecks, etc.), code monitoring (event/method invocation detection, code state monitoring such as getting the structure size, etc.), and all of the above (for example, monitoring performance as a function of the code state, size of structures, number of predefined objects, etc.).


The java.lang.instrument package, available since Java 1.5, enables us to write a handler, accessible via an agent loaded into the JVM, to which all class load requests are passed. This enables us to measure time and performance, analyze state, or even instrument the classes being loaded.

The instrumentation is done by modifying method bytecode.

Creating an instrumentation agent is pretty straightforward:

- the Premain-Class is defined in the agent JAR manifest => this class must implement a static premain method (taking an args string and an Instrumentation instance, which is created by the JVM and automatically passed to the premain method of the Premain-Class). The Instrumentation instance is then passed on, and can be used to monitor the list and usage of all loaded/initialized classes, to add/remove transformers, and to redefine classes.

.....

- what we need next, in order to reconstruct the full call-tree trace, is the information about when each method ends. We cannot get this information via the classloader, especially not at execution time (we might try to get some of it post-execution via call stack reconstruction). However - we can get the return info another way -> by mangling the bytecode using the transformation framework and the BCEL bytecode engineering library. We simply alter the bytecode, either by creating a wrapper method or by adding an instruction that notifies a static method just before the final return in the method. All in all this gives us all the necessary ingredients for making a proper runtime-profiler-call-dynamic-execution-tracing-whatever :) Anyway, once I get it all together and start writing decent posts, I might write a proper post about it :)



References :

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/instrument/package-summary.html
http://www.javalobby.org/java/forums/t19309.html
http://www.ibm.com/developerworks/java/library/j-jtp09196/index.html
http://www.cs.nuim.ie/~jpower/Research/instrument/
http://www-128.ibm.com/developerworks/java/library/j-jip/

Wednesday, June 25, 2008

Profiling java apps using DTrace

- monitoring operational performance of java apps
- finding bottlenecks
- examining cache effects
- example : heavily IO-bound app
- OS X / Solaris

idea :
(a) current Java profiling options : (what/where)
(b) Java options for system-level profiling (we can debug, but cannot do the strace-like detailed mapping to syscalls, oprofile-like stuff, etc.)

dtrace - cool caps :



references :

http://www.devx.com/Java/Article/33943/0/page/3
http://www.solarisinternals.com/wiki/index.php/DTrace_Topics_Java
http://developers.sun.com/solaris/articles/java_on_solaris.html
http://developers.sun.com/solaris/articles/dtrace_ajax.html

Saturday, March 29, 2008

Managing shared state in Erlang

Though Erlang is a functional language with no apparent shared state, we can trivially implement state in it by (ab)using the single-assignment concept (each variable can be bound only once):

whereis(list_to_atom("portServer" ++ integer_to_list(PORT))).



Bloom filters for profit & fun

Perfect Hash Functions for fun and profit

Thursday, March 27, 2008

Breakpoint

Okie, cut here

This is another attempt at actually starting to write something ;)
So, stay tuned for a bunch of stupid posts :) I need something in order to get started. Hopefully, I might be able to actually post something useful some day (in which case I will erase all the posts made until that day :) ). So, some value after all .... these posts won't last forever .... :) :) read them while they are still here ... and have mercy :)