ACL 2013, Day 1

The Association for Computational Linguistics (ACL) conference is one of the top-ranked conferences in the field of natural language processing. This year I am attending ACL 2013 (http://www.acl2013.org/site/) in Sofia, Bulgaria. Surprisingly, the main conference sponsor is Baidu. My presentation is scheduled for Friday, 9 August 2013, at 16:30 (GMT+2) in the BioNLP Workshop, as part of the Gene Regulation Network Shared Task, which Marinka Žitnik and I won.

Yesterday I flew via Vienna to Sofia International Airport and took a cab to 44 William Gladstone Street, where my hotel, Art’Otel, is located. I am staying here until Saturday. The hotel is nothing special, but it is good enough (in my opinion not worth all of its ****) and quite close to the conference venue, a 10-minute walk.

First impressions of Sofia: I had thought Bulgaria was in very bad condition, but it is not. The cars are normal, the people look European, and the streets are clean. I also like the calm of the city. There is no rush, the streets are not overcrowded with cars, and there are just enough people around. The only thing one notices is that most of the buildings are old. For instance, the National Palace of Culture (NDK) is an enormous, very nice building, but it should be renovated to look more modern. The same goes for the park in front of it. To conclude, the only thing Sofia needs, in my opinion, is building renovation.

Today, Sunday, August 4th, was the tutorial day at the conference. There were four parallel tutorials in the morning and another four in the afternoon.

In the morning I attended the tutorial Variational Inference for Structured NLP Models by David Burkett and Dan Klein.

The tutorial was very informative and well presented. The focus was on how to efficiently implement inference over a given factor graph with a static structure. It started with an introduction to HMMs and then to different CRF types (linear, arbitrary, tree-like). First we were introduced to inference using mean field and then to its approximation when learning two interdependent labeling tasks. We continued with the problem of joint parsing and alignment. Lastly, we covered (“loopy”) belief propagation and its use for inference in dependency parsing.

During the lunch break I went to Boom, which appears to be the best place to eat a burger in Sofia. I got this inside tip from my friend Didka (there will probably be many more tips during my stay in Bulgaria :)).

In the afternoon I attended the tutorial Robust Automated Natural Language Processing with Multiword Expressions and Collocations by Valia Kordoni and Markus Egg.

The talk was about identifying multiword expressions, for example “take the clothes off”, which means the same as “undress”. I saw no technical information about algorithms or approaches, just a raw history of research in this field, so I went to another tutorial session after the coffee break, even though I had not registered for it.

I moved to the tutorial Exploiting Social Media for Natural Language Processing: Bridging the Gap between Language-centric and Real-world Applications by Simone Paolo Ponzetto and Andrea Zielinski.

This tutorial was a bit more interesting, but it stayed at a very general level. Friends later told me that the first part was better, as it gave more technical details. The second part was a review of work on entity and event extraction from Twitter, along with presentations of some practical systems. For example, the talk focused on the extraction of person names, e.g. “Steve Jobs”, and events, e.g. “DEATH”. Two interesting systems dealt with earthquake reporting and location-based disease information aggregation.

In the evening there was a welcome reception at Sky Plaza, on top of the NDK. There we got Bulgarian food, drinks and some live music. After a few hours of mingling I went back to the hotel, and here I am writing this post …

Kitchen feat: Making a strawberry roll

Strawberry roll is one of my favorite desserts, so I decided to try making it myself. In this post I describe the whole preparation, which is similar to the recipe from Kulinarika.net (http://www.kulinarika.net/recepti/6398/sladice/jagodna-rolada/).

Ingredients:

  • 5 eggs
  • 100 g flour
  • 1 packet of baking powder
  • 90 g powdered sugar
  • strawberries
  • plant-based cream
  • Sladki greh

Preparation:

We start with the ingredients for the sponge. First, mix the eggs and the powdered sugar.

Then add the flour and the baking powder to the mixture. Make sure to do this in a large enough bowl. I had to fetch a bigger one partway through :).

Next, line a baking tray with baking paper, grease it lightly with oil and pour in the sponge batter. Bake for about 15-20 minutes at 180 °C.

After baking, leave the sponge for a few moments to cool slightly, then roll it up together with the baking paper. After about 10 minutes, unroll it. At this point it is a good idea to soak it with some liqueur; I used yogurt instead. Then spread the plant-based cream over it and add the strawberries. Finally, roll the sponge up into a roll.

The last step is decorating the outside with cream, strawberries and Sladki greh. When finished, put the roll in the refrigerator and serve it the next day.

The finished product, served with ice cream and crumbs, looked like this:

[photo: IMAG0372]

Improvements for next time:
Use only 1 packet of baking powder. The sponge should be thinner and soaked more thoroughly with liqueur.

Optilab SCI Talk S02E01 – An Introduction into Entity Detection

I am posting the first lecture of the second season of Optilab’s Science Talks. This recording was a pilot project, but from now on all lectures will be professionally recorded and published.

The aim of this talk is to give a brief introduction to basic data mining methods, present the problem, and finally explain the current approach to uncovering entities in text. The next lecture will continue the topic by presenting the Hidden Markov Model algorithm and will be aired in May 2013.

Slides:

Google Hangout recording on Youtube:

How to integrate two independent jQuery libraries within a single page?

It has been a long time since my last post. That is not because I have nothing useful to write, but rather because I am too lazy to document the interesting stuff I do… Anyway, let’s get to the point!

Recently I had to integrate Lightbox 2 into a website that is run by a CMS. There was no FTP access, no PHP source code access, … The only thing I could use was the CMS’s static content form.

Surprisingly, I could add the following code as HTML form content:

<script type="text/javascript" src="http://zitnik.si/temp/lightbox/js/jquery-1.7.2.min.js"></script>
<script type="text/javascript">
// <![CDATA[
document.write("<link href='http://zitnik.si/temp/lightbox/css/lightbox.css' rel='stylesheet' />");
// ]]>
</script>
<script type="text/javascript" src="http://zitnik.si/temp/lightbox/js/lightbox.js"></script>

These lines load the appropriate jQuery library (currently 1.7.2, used by Lightbox 2), the CSS styles and finally Lightbox’s JavaScript code. Because I could not insert link tags directly into the content, I printed the tag using JavaScript. This has nothing to do with having multiple jQuery libraries on a page, but it was needed in my case and seems a nice workaround :).
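
A possible alternative to document.write is to build the link element through the DOM. The sketch below is only an illustration: addStylesheet is a hypothetical helper name, and whether a given CMS form accepts such a script is untested.

```javascript
// Hypothetical helper (not part of Lightbox or jQuery): appends a
// <link rel="stylesheet"> element to the <head> of the given document.
function addStylesheet(doc, href) {
  var link = doc.createElement('link');
  link.rel = 'stylesheet';
  link.href = href;
  doc.getElementsByTagName('head')[0].appendChild(link);
  return link;
}

// In a browser, pass the global document object.
if (typeof document !== 'undefined') {
  addStylesheet(document, 'http://zitnik.si/temp/lightbox/css/lightbox.css');
}
```

Passing the document in as a parameter keeps the helper easy to try out with a stand-in object outside the browser.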

The problem was that the CMS uses jQuery 1.5.2, but Lightbox needed 1.7.2. Because I could not upgrade the 1.5.2 version, I had to keep the two libraries separate. I also could not simply override the old one, because other parts of the CMS-generated page stopped working. The solution is to load the second library into its own variable. To the inline JavaScript above I added the following:
var $jq172 = jQuery.noConflict(true);
As you may know, jQuery functions can be called through $(). After this command, the previously loaded jQuery is again accessible through $() and the last added library through $jq172(). If the parameter to noConflict is true, both the $ and jQuery globals are handed back to the previous library; otherwise only $ is handed back and the global jQuery keeps pointing to the newly loaded version.
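
To make the effect of the parameter more concrete, here is a simplified model of how a jQuery build registers itself and how noConflict hands the globals back. It is an illustration only, not jQuery's source: loadFakeJQuery and the page object (standing in for window) are made-up names.

```javascript
// Simplified model of a jQuery build taking over the globals and of
// noConflict(deep) restoring them. Illustration only, not jQuery's source.
function loadFakeJQuery(root, version) {
  var previousJQuery = root.jQuery; // whatever was registered before this build
  var previous$ = root.$;

  function lib(selector) {
    return { selector: selector, version: version };
  }
  lib.version = version;

  lib.noConflict = function (deep) {
    // $ is always handed back to the previous owner.
    if (root.$ === lib) { root.$ = previous$; }
    // jQuery is handed back only when deep is true.
    if (deep && root.jQuery === lib) { root.jQuery = previousJQuery; }
    return lib;
  };

  root.jQuery = root.$ = lib; // the new build takes over both globals
  return lib;
}

var page = {};                  // stands in for window
loadFakeJQuery(page, '1.5.2');  // the CMS's library
loadFakeJQuery(page, '1.7.2');  // the library Lightbox needs
var $jq172 = page.jQuery.noConflict(true);

console.log(page.$.version, page.jQuery.version, $jq172.version);
// prints: 1.5.2 1.5.2 1.7.2
```

Calling noConflict() without true in the same model would leave page.jQuery at 1.7.2 while page.$ goes back to 1.5.2.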

Lastly, I needed to make a minor change to lightbox.js to instruct the script to use the 1.7.2 library. Thanks to its clean coding style, I only needed to change line 46 to
$ = $jq172;
With all this in place, I had a working Lightbox 2 and working CMS scripts, and everything was done without any CMS code change. The Lightbox library can then be used completely normally.
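
The same binding can be made for your own code that should run against the 1.7.2 instance, by wrapping it in a function that receives the library as $. The sketch below uses two stub functions in place of the real libraries so that it runs on its own; makeStub and initGallery are made-up names.

```javascript
// Stubs standing in for the two jQuery builds (assumptions for illustration).
function makeStub(version) {
  return function (selector) {
    return { selector: selector, version: version };
  };
}
var $ = makeStub('1.5.2');       // what the CMS scripts see as $
var $jq172 = makeStub('1.7.2');  // the instance kept by noConflict(true)

// Inside the wrapper, $ is a local parameter bound to the 1.7.2 instance;
// the global $ outside stays untouched.
var initGallery = (function ($) {
  return function (selector) {
    return $(selector);
  };
})($jq172);

var links = initGallery('a[rel^=lightbox]');
console.log(links.version);                  // 1.7.2
console.log($('a[rel^=lightbox]').version);  // 1.5.2
```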

YouTrack4 installation on Ubuntu 12.04

YouTrack4 seems to be a free alternative to Atlassian’s JIRA. I use JIRA on production projects, and at first sight it seems far better than YouTrack4. The missing features I noticed immediately are task time tracking and a few other minor things.

Both YouTrack4 and JIRA can be hosted, but the download versions are cheaper. The smallest edition of both is limited to 10 users. For YouTrack4 it is free, while for JIRA it costs $10 (still very little). All bigger packages cost more; see the pricing pages.

 

As I will soon start a simple home project, I gave YouTrack4 a try. In the rest of this post I describe how to install it as a service on Ubuntu 12.04.

STEP 1: Download the complete JAR bundle from http://www.jetbrains.com/youtrack/download/get_youtrack.html.

STEP 2: Run the bundle and test YouTrack4.

I copied the downloaded JAR to ~/startup/youtrack-4.0.2/youtrack-4.0.2.jar. Then I created the script ~/startup/runYoutrack4.sh, which starts the integrated Jetty web server on port 8082:

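#!/bin/bash
# Start YouTrack's bundled Jetty server on port 8082.
# Abort if the installation directory is missing.
cd /home/slavkoz/startup/youtrack-4.0.2 || exit 1
java -Xmx512m -Djava.awt.headless=true -jar youtrack-4.0.2.jar 8082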

STEP 3: Run the script runYoutrack4.sh as root. If you can use YouTrack4 at http://localhost:8082, continue.

STEP 4: Create init script /etc/init.d/youtrack4:

#!/bin/sh
### BEGIN INIT INFO
# Provides:          youtrack4
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start YouTrack4 at boot time
# Description:       Runs the YouTrack4 start script.
### END INIT INFO
#############################################################
# Init script for YouTrack4
#############################################################
# Path to the start script created in STEP 2
SCRIPTNAME=/home/slavkoz/startup/runYoutrack4.sh
case "$1" in
  start)
    sudo "$SCRIPTNAME" start
    ;;
  *)
    echo "Usage: $0 start" >&2
    exit 3
    ;;
esac
exit 0

STEP 5: Make the init script executable (sudo chmod +x /etc/init.d/youtrack4) and test the service by issuing the command /etc/init.d/youtrack4 start. If YouTrack4 starts, continue.

STEP 6: Set the service to start automatically at system startup using the command:

sudo update-rc.d youtrack4 defaults

STEP 7: If everything went well, restart your computer; YouTrack4 should be accessible at http://localhost:8082.

Enjoy managing your projects …

Marinka Žitnik (sister) – diploma thesis defense

Today Marinka Žitnik defended her diploma thesis, titled “Pristop matrične faktorizacije za gradnjo napovednih modelov iz heterogenih podatkovnih virov” (A Matrix Factorization Approach for Inference of Prediction Models from Heterogeneous Data Sources), for which I sincerely congratulate her!!!

It is especially worth pointing out that in the undergraduate Interdisciplinary Study of Computer Science and Mathematics at the Faculty of Computer and Information Science and the Faculty of Mathematics and Physics of the University of Ljubljana she achieved an average grade of 10.0, including the thesis. On top of that, with all her achievements along the way, she completed her studies in less than 4 years.

Of course I attended the defense and also recorded it (Chrome 6, Safari :), the same full HD recording is also available at: http://zitnik.si/temp/ZagovorDiplome_MarinkaZitnik_20_07_2012.mp4):


A few pictures with her mentor (Prof. Dr. Blaž Zupan) and the committee:

Optilab Tech Talk – “Ontology as NoSQL Database Schema”

Today I presented a HOT topic, ontologies and NoSQL, as a Tech Talk at Optilab d.o.o. I work at this company as a Junior Researcher, and Tech Talks are our internal lectures for co-workers. A typical talk covers something one of us is working on or knowledge someone would simply like to share.

The main questions I tried to address in my talk were:

  • WHAT NoSQL IS MISSING FOR GENERAL USE
  • HOW ONTOLOGY CAN HELP SOLVE THE PROBLEM

I see an ontology as an additional layer over a NoSQL database. It can provide a nice runtime-customizable schema and the SPARQL/Update language for easy data manipulation. I believe this is especially important when combining data from different sources, because after some time no one will know what relation or concept types the database contains. Another benefit is SPARQL support: through an endpoint a user can run some analysis. Furthermore, when the data is represented by an ontology, we can quickly move it to another appropriate store, which cannot be done so easily between raw NoSQL stores. For example, try to straightforwardly transfer data from a key-value store to a graph data store :).

FRI Summer School – “How to make your own Facebook”

I attended the FRI Summer School “How to make your own Facebook” from 9 to 13 July 2012.

The school was run mainly by some of the best Slovene open-source developers: Aleš Justin, Marko Lukša, Tomaž Cerar and Marko.

The initial project is available on GitHub: https://github.com/openblend. Throughout the week they taught us programming in Java EE on the JBoss 7 server. First they introduced us to the Git version control system, then to the IntelliJ IDEA IDE and the Maven build tool. The lessons continued with CDI, JSF, Servlets, EJB3, JPA with an integrated H2 SQL data store, debugging and testing techniques, and lots of tricks…

At the end I also published the app on OpenShift. This is a scalable PaaS with three free instances (JBoss, database, other environment) for your own application.

Those who stayed until the end of the lessons got a free ticket to the OpenBlend conference. This is a Slovene Java programming conference that will take place on 20 September 2012. I hope I will have time to attend it.

Justin also introduced us to his work on the new language Ceylon (runs on the JVM) and on CapeDwarf, a Google App Engine implementation for JBoss. These are some new toys I need to try.