Showing posts with label AbiWord. Show all posts

Thursday, 16 August 2012

GSoC in AbiWord : Final Report

And so this season of Google Summer of Code comes to an end. It has been simply awesome participating in the program and, moreover, being a part of the awesome AbiWord community.

About the Project: The OpenDocument Math filters and the MathML to LaTeX conversion were the major focus of my project before the Midterm, whereas after the Midterm I mainly took up the Math import and export filters for the DOCX (OOXML) format. Here is a brief summary of the work done after the Midterm:

1. Implemented OMML to MathML & MathML to OMML converters, as AbiWord and Word use MathML and OMML for storing the Math respectively.

2. Created the Math element and Math ListenerState in the OpenXML plugin to implement the import & export of Math.

3. Completely implemented the import/export of Math from/to docx. AbiWord can now read and edit the Math exported by MS Word, and similarly MS Word can read and edit the Math exported from AbiWord.

4. With this we now have full Math support for both odt and docx formats \m/

5. Fixed various Windows-specific build bugs, such as the build errors in the OpenDocument and MathView plugins, making the Windows build completely error free in preparation for the next development release.

6. Sorted out various language issues in the Windows Installer, making it much better in terms of localization.

7. Squashed a few bugs listed as 3.0 blockers, and I am working on others as well.

I'm again quite happy with all the work and the learning that has come out of it. It has been a truly wonderful Summer and I plan to continue to work on AbiWord and contribute as much as possible.

I would like to take this opportunity to thank Google for this wonderful program and initiative. I'd also like to thank my awesome mentor Jean Brefort and the other AbiWord developers Marc, Hub, Martin, Chris & Pradeeban for being so helpful and supportive. It's been a pleasure working on AbiWord and I hope to continue to do so.

Sunday, 8 July 2012

GSoC in AbiWord : Midterm Report

Time truly flies & it's Midterm already! It has been a truly awesome experience working on AbiWord so far and I hope it continues the same way.

The progress made and the work done till now is as follows:

1. Implemented the MathML to LaTeX Converter for the MathML import either directly from MathML/XML or from ODT. It will also be useful for the Math import from DOCX which is going to be the next step.

2. Now we can edit MathML inside AbiWord \m/, which means we can edit the Math Equations from ODT inside AbiWord, as well as those from the MathML/XML files.

3. Fixed the Math Object Import (MathML Import) for OpenDocument, which was broken.

4. Fixed errors in Equation Insert from MathML for Windows.

5. Fixed libxslt related issues in the LaTeX plugin, which used to cause a loss of functionality of the plugin especially in Windows.

6. Removed quite a few Windows build-related errors and also implemented a Perl script for automatic conversion of PO files to Strings for Windows.

I'm happy with the work done till now and the fact that all the milestones planned till the Mid-Term have been achieved. It has been wonderful working with the AbiWord Community who have been very helpful, especially my Awesome Mentor Jean Brefort.

The next step now is to implement the Math import/export from/to docx.

Sunday, 3 June 2012

XML & its use in AbiWord

Extensible Markup Language (XML) is a markup language which defines a set of rules for encoding documents in a format that is both human and machine readable. It is a markup language much like HTML but with a completely different goal: HTML was designed to display data with a focus on how the data looks, whereas XML is about transporting and storing data with a focus on what the data is. Unlike HTML, XML has no predefined tags and is designed to be self-descriptive. Since it allows users to define their own tags, a Document Type Definition (DTD) can be used to declare which tags and attributes are valid; it is referenced near the top of the file.

Some common constructs that appear in XML:

XML Declaration : <?xml version="1.0" encoding="UTF-8"?> is the XML declaration. It is not required, but it identifies the document as XML and indicates the XML version and the character encoding.

Character : Any XML document is a string of characters and almost every legal Unicode character can appear in it. All XML processors must be able to read entities in both the UTF-8 & UTF-16 encodings.

Markup and Content : The characters in an XML document are divided into Markup and Content, which are distinguished by simple syntactic rules. Strings which constitute Markup either begin with the character < and end with >, or begin with & and end with ; . Strings of characters which are not Markup are Content.

Tags : Tags are the markup constructs which begin with < and end with >. A tag can be a start-tag <block>, an end-tag </block> or an empty-element tag <line-break />. The document component which starts with a start-tag and ends with the matching end-tag, or consists only of an empty-element tag, is called an Element. Whatever lies between the start-tag and the end-tag is the Element's content, which may contain child elements as well.

Attributes : another markup construct, consisting of a name/value pair that sits inside a start-tag or an empty-element tag. For example, in <p style="Normal">, the attribute name is style and its value is Normal.

The processor analyzes the markup and passes the structured information on to an application. This processor is often called an XML parser. Many word processing programs have XML as their native document format, for example our very own AbiWord (.abw documents are XML).
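
To make the constructs above a bit more concrete, here is a minimal sketch of what a parser hands to an application. It uses Python's standard xml.etree.ElementTree module. The tiny document is made up for illustration: it reuses the block and line-break tag names from above, and the id attribute is an assumption of mine.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one element with an attribute, some content
# and an empty-element child.
DOC = b'<?xml version="1.0" encoding="UTF-8"?><block id="intro">Hello<line-break/>world</block>'

root = ET.fromstring(DOC)   # the parser turns the markup into a tree of Elements

tag = root.tag              # name of the element
attributes = root.attrib    # name/value pairs from the start-tag
leading_text = root.text    # content before the first child element
child = root[0]             # the empty-element tag becomes a child Element
trailing_text = child.tail  # content that follows the child inside <block>

print(tag, attributes, leading_text, child.tag, trailing_text)
```

The application never sees the angle brackets; it only sees element names, attributes and text.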
 
XML in AbiWord : AbiWord uses a straightforward XML document format in which appearance and layout are specified in CSS-like attributes but only as a starting point. The entire XML source of a document (sample.abw) created in AbiWord which contains the text “AbiWord Rocks!” looks like :

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE abiword PUBLIC "-//ABISOURCE//DTD AWML 1.0 Strict//EN" "http://www.abisource.com/awml.dtd">
<abiword template="false" xmlns:ct="http://www.abisource.com/changetracking.dtd" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:math="http://www.w3.org/1998/Math/MathML" xid-max="2" xmlns:dc="http://purl.org/dc/elements/1.1/" styles="unlocked" fileformat="1.0" xmlns:svg="http://www.w3.org/2000/svg" xmlns:awml="http://www.abisource.com/awml.dtd" xmlns="http://www.abisource.com/awml.dtd" xmlns:xlink="http://www.w3.org/1999/xlink" version="0.99.2" xml:space="preserve" props="dom-dir:ltr; document-footnote-restart-section:0; document-endnote-type:numeric; document-endnote-place-enddoc:1; document-endnote-initial:1; lang:en-US; document-endnote-restart-section:0; document-footnote-restart-page:0; document-footnote-type:numeric; document-footnote-initial:1; document-endnote-place-endsection:0">
<!-- ======================================================================== -->
<!-- This file is an AbiWord document.                                        -->
<!-- AbiWord is a free, Open Source word processor.                           -->
<!-- More information about AbiWord is available at http://www.abisource.com/ -->
<!-- You should not edit this file by hand.                                   -->
<!-- ======================================================================== -->

<metadata>
<m key="abiword.generator">AbiWord</m>
<m key="dc.creator">Prashant</m>
<m key="dc.format">application/x-abiword</m>
</metadata>
<rdf>
</rdf>
<history version="1" edit-time="14" last-saved="1338761497" uid="0e329dea-adc9-11e1-9005-9b3eee35aa57">
<version id="1" started="1338761497" uid="16910e40-adc9-11e1-9005-9b3eee35aa57" auto="0" top-xid="2"/>
</history>
<styles>
<s type="P" name="Normal" followedby="Current Settings" props="font-family:Times New Roman; margin-top:0pt; color:000000; margin-left:0pt; text-position:normal; widows:2; font-style:normal; text-indent:0in; font-variant:normal; font-weight:normal; margin-right:0pt; font-size:12pt; text-decoration:none; margin-bottom:0pt; line-height:1.0; bgcolor:transparent; text-align:left; font-stretch:normal"/>
</styles>
<pagesize pagetype="Letter" orientation="portrait" width="8.500000" height="11.000000" units="in" page-scale="1.000000"/>
<section xid="1" props="page-margin-footer:0.5in; page-margin-header:0.5in">
<p style="Normal" xid="2"><c>AbiWord Rocks !</c></p>
</section>
</abiword>
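
Because the file is plain XML, any XML parser can read it. Below is a minimal sketch, again with Python's xml.etree.ElementTree, that pulls the metadata and the paragraph text out of a trimmed copy of the sample above. The trimming (no DOCTYPE, fewer attributes) and the helper name read_abw are my own choices for illustration; this is not how AbiWord itself loads documents.

```python
import xml.etree.ElementTree as ET

# The default namespace declared on the <abiword> root element.
AWML = "http://www.abisource.com/awml.dtd"
NS = {"awml": AWML}

# Trimmed copy of sample.abw, kept short for the example.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<abiword xmlns="http://www.abisource.com/awml.dtd" version="0.99.2">
<metadata>
<m key="dc.creator">Prashant</m>
</metadata>
<section xid="1">
<p style="Normal" xid="2"><c>AbiWord Rocks !</c></p>
</section>
</abiword>"""

def read_abw(data):
    """Return (metadata dict, list of paragraph texts) from .abw XML bytes."""
    root = ET.fromstring(data)
    # Each <m key="...">value</m> inside <metadata> becomes one dict entry.
    metadata = {m.get("key"): m.text
                for m in root.findall("awml:metadata/awml:m", NS)}
    # Join all text found inside each <p>, including text in nested <c> runs.
    paragraphs = ["".join(p.itertext()) for p in root.iter("{%s}p" % AWML)]
    return metadata, paragraphs

metadata, paragraphs = read_abw(SAMPLE)
print(metadata, paragraphs)
```

Note that the element names have to be qualified with the namespace, since the root element declares a default xmlns.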

The inherent readability of XML makes interchange and format specification much easier. Apart from AbiWord's own format, the Open Document Format mentioned in the previous post is also XML-based, whereas Microsoft Office now uses OOXML (a zipped, XML-based file format) as its default format.

Monday, 23 April 2012

and here it comes... GSoC 2012 - Accepted \m/\m/\m/

After a lot of wait and anxious moments, here it comes... I got selected for GSoC 2012. I'll be working for AbiWord, the supercool cross platform open source word processor under the mentorship of Jean Brefort. And my project is to "Implement and Improve the import and export of math from/to odt, doc & docx formats".

A total of 6 students were selected for AbiWord this year.



After I get done with my end semester exams by the 2nd of May, I plan to get into the action with full energy and not only complete my project but contribute as much as possible and, in the process, learn as much as I can.

Looking forward to a Summer full of learning, fun, excitement and a lot of code :)

Friday, 13 April 2012

Let the Fun begin !

This blog is aimed at keeping track of my open source ventures. I've always been awed by the concept of FOSS (free and open source software), but never actually got my own hands dirty. But now with the summer holidays and my ever growing passion for programming, I'm in and I'm in for good.

I'm starting out by working for an awesome open source cross platform word processor, AbiWord. I've been aware of its existence and I've actually seen people use it on many low-config PCs (those which couldn't handle the heavy requirements of MS Office), but I never really contributed to it. Now, in the process of applying for GSoC 2012, I'm looking into it quite deeply. I've worked on bugs & created a few patches, essentially getting a flavor of the code.

Truth be told, I'm totally impressed by how things work in the AbiWord community, with so many people from different continents working in collaboration. Such is their dedication that even with full time jobs and families they spend a lot of time hacking on AbiWord, and that too all voluntarily. That, I think, is the beauty of open source. I'm loving it, and I think this is something I'm going to sink right into.

The Judgement Day (23rd April) is the day the GSoC results come out (eagerly waiting for it :)). I've applied for the project of improving the math import/export in AbiWord, with the center of attraction being the MathML to itex converter, as AbiWord uses itex as its Math Composition Language. For instance, AbiWord can currently import the MathML from odt, but we can't edit it inside AbiWord.

Even though getting selected will be a great honor and responsibility, I plan on dropping all my other internship options (foreign interns - I'm sorry !) and doing this no matter what, and using this blog to keep track of my work.