Rules for Theses (in the Fischer Research Group, AG Fischer)

Procedure

How do I write a bachelor's/master's thesis?

While writing:

  • If you need a lot of computing power: besides the large compute nodes for students (e.g., plutonium), you may also request an account on one of the chair's servers. In addition, there is the LiDo cluster.

How do I submit the thesis?

  • Write the thesis according to the template
    • A list of figures/tables/algorithms is not necessary
    • Before submitting, make sure to check the bibliography
  • Pick up the cover sheet at the dean's office
  • Print the thesis and have it bound together with the cover sheet
  • Submit the thesis at the dean's office. Outside the office hours of the dean's office, the thesis can be handed in at the control room (Leitwarte).

See also the chair-wide regulations

Frequently Recurring Errors in German

  • It is neither better nor more correct German to consistently replace the common relative pronouns “der/die/das” with “welcher/welche/welches”! Please refrain from doing so unless you have a good reason.
  • In general, German does not allow two or more nouns to be placed next to each other unconnected (the so-called Idiotenleerzeichen, as in “die Bit Kodierung”, “der Markov Prozess”, etc.; after all, nobody writes “das Mittag Essen war lecker” or “der Küchen Schrank ist offen” either). If the closed compound looks too odd to you (which is often the case in combination with English technical terms), use hyphens (e.g., “Suffix-Array”).
  • Even though the reformed German orthography (unfortunately) no longer strictly requires it, separating extended infinitives (erweiterte Infinitive) with a comma greatly improves the readability of sentences. (If you do not know what extended infinitives are, take this as an occasion to refresh your knowledge of orthography, e.g., by reading the book “Deutsche Orthografie” by Peter Eisenberg (De Gruyter 2017), which is available electronically in our university library.)
  • Do not necessarily translate English technical terms! But if you use English, then set it in italics and with correct capitalization (e.g., “Das \textit{sparse suffix sorting}”).
  • In case of doubt, please consult an online dictionary, e.g., the one by Duden.

General Rules (by Dominik Köppl)

Spell Checking

  • Use automatic spell checking, e.g. aspell!
  • Ask your mom or girlfriend/best friend to proofread your text, if applicable.

You have to check manually:

  • duplicated words like 'and and'
  • tenses, etc.
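
The check for duplicated words can be partly automated. The following is a minimal sketch in C++, not a replacement for proofreading; the function names normalize and find_doubled_words are hypothetical and chosen only for this illustration. It assumes plain text with whitespace-separated words and ignores case and surrounding punctuation.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Lowercases a word and strips leading/trailing non-alphanumeric characters,
// so that "And" and "and," compare equal. (Hypothetical helper.)
std::string normalize(const std::string& word) {
    std::size_t begin = 0;
    std::size_t end = word.size();
    while (begin < end && !std::isalnum(static_cast<unsigned char>(word[begin]))) ++begin;
    while (end > begin && !std::isalnum(static_cast<unsigned char>(word[end - 1]))) --end;
    std::string result;
    for (std::size_t i = begin; i < end; ++i) {
        result += static_cast<char>(std::tolower(static_cast<unsigned char>(word[i])));
    }
    return result;
}

// Returns every word that is immediately followed by itself.
std::vector<std::string> find_doubled_words(const std::string& text) {
    std::vector<std::string> doubled;
    std::istringstream stream(text);
    std::string previous;
    std::string word;
    while (stream >> word) {
        const std::string current = normalize(word);
        if (!current.empty() && current == previous) {
            doubled.push_back(current);
        }
        previous = current;
    }
    return doubled;
}
```

Because punctuation is stripped, a legitimate repetition across a sentence boundary (“… that. That …”) is reported as well, so treat every hit as a candidate to inspect by hand.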

Style

  • keep sentences as short as possible, as eloquent as necessary
  • Prefer active over passive form:
    • bad: “One creates a tree …”
    • bad: “A tree is created for…”
    • good: “We create a tree…”
  • write output-driven/result-oriented; a thesis is not a diary or a logbook
  • I want to see what your motivations are and why you write this
  • do not talk about unnecessary stuff (e.g., historic facts, what a red-black tree is…)
  • use scientific resources as literature (not Wikipedia)
  • avoid inner clauses:
    • bad: 'Exporting data from the main memory to the hard drive is, due to the lower access rates, very inefficient.'
    • good: 'Due to the lower access rates, exporting data from the main memory to the hard drive is very inefficient.'
  • Footnotes must not contain essential information that is needed to understand the text. Footnotes should be used sparingly.
  • When measuring time/space, always add units.
    • bad: 'The quickest runs in O(n lg n)'

Is it O(n lg n) time or space? If it is space, is it O(n lg n) bits, O(n lg n) bytes or O(n lg n) words?

Notation for Stringology Articles

  • σ : alphabet size
  • Σ : alphabet
  • T : the main text
  • n : the length of T
  • P : a pattern
  • m : the length of P
  • \ell : a length or a leaf of a tree
  • i, j, k : counting variables or positions
  • lowercase Latin/Greek letters for numeric variables
  • uppercase letters for strings, substrings, factors, arrays, and data structures like heaps
  • use meaningful variable names like Q for a queue, H for a heap, T for the main text
  • When bounding the domain of an integer, prefer ≤ to <, i.e., instead of 0 < i < 5, write 1 ≤ i ≤ 4.

Typography

  • always use the same font family
  • prefer suitable highlighting (italic, bold, teletype, or sans-serif font) over putting text in quotation marks (“”)
  • highlight keywords when they are introduced, e.g., with \emph{…} or \textbf{\emph{…}} or with color.
  • “Hard-coded” letters or example strings are usually put in \texttt{…}
  • normal text in the math-environment is put into \text{…}
  • If you add a citation reference at the end of the sentence, add it before the full stop. Like: This is always true [4].
  • Wrap function names in a text-environment like \textup{…} or \textsl{…} when used in math mode
  • Paragraphs should contain at least two sentences. A sequence of one-sentence paragraphs looks very ragged.
  • Add introductory text between a \section{…} heading and its first \subsection{…}

Figures, Pictures and Tables

  • Drawing pictures yourself is appreciated. Otherwise, always cite the source you copied the material from. There is no exception, even when the material is public domain, open source, etc.
  • Pictures support your explanations, so add them when applicable.
  • Use only vector graphics, or if unavailable, use raster images with high resolution
  • Check with a gray-scale printer whether your pictures still look good
  • Your picture descriptions should not refer to the colors used
  • Instead of plain red, blue, green, and yellow, pick one of the many available color schemes (e.g., Google color or ColorBrewer)
  • Captions should be written in a unified form. For instance, start with the title of the figure (i.e., not a whole sentence). Optionally, the title is followed by full sentences explaining the figure. Be consistent with the full stops in the captions.
  • Only tables showing real data (like time/space in experiments) are labeled as “Table”. A table used as a form of visualization is still a “Figure”.

Pseudo Code

  • algorithm2e with \usepackage[linesnumbered,ruled,vlined]{algorithm2e} can produce nice, compact pseudo code
  • it is possible to add colored syntax highlighting to your pseudo code
  • pseudo code should not replace the description of an algorithm. An algorithm should be described in such a way that it is understandable even if the pseudo code is missing. Pseudo code is only an additional aid for understanding what is going on.

References

  • Every reference in the list of references should be cited in your main text. Use references where they apply!
  • Each reference needs a minimum set of properties, depending on the medium:
    • Websites: URL, title, authors (if known), the date on which you accessed the website (as a formal date)
    • Proceedings: authors, title, conference/journal name, year, pages, publisher, series (if available), volume (if available).
    • Book: authors, title, year, publisher, edition.
  • All references have to be unified, including
    • Author names (order of first and family name)
    • Full name or abbreviation of a conference
  • When citing books or larger articles, add page numbers or the specific theorem.

Examples

For a conference article, use inproceedings:

  • title: make sure that proper names (e.g., of persons) stay capitalized (achieved by putting {} around their first letters)
  • booktitle: the conference name or its abbreviation; long version: “In Proceedings of …”
  • if the proceedings are part of a series (like Springer's LNCS), add series and volume

For a journal article, use article:

  • journal: the journal name or its abbreviation
  • additionally, fill in the number field

@inproceedings{lzciss,
  author    = {Johannes Fischer and Tomohiro I and Dominik K{\"{o}}ppl},
  title     = {{L}empel-{Z}iv Computation in Small Space ({LZ-CISS})},
  booktitle = {Proc.\ CPM},
  publisher = {Springer},
  pages     = {172--184},
  series    = {LNCS},
  volume    = {9133},
  year      = {2015}
}

@article{cohen10fast,
  author  = {Hagai Cohen and Ely Porat},
  title   = {Fast Set Intersection and Two-Patterns Matching},
  journal = {Theor.~Comput.~Sci.},
  volume  = {411},
  number  = {40--42},
  pages   = {3795--3800},
  year    = {2010}
}

Please obey these rules strictly. Most free citation services like Google Scholar, DBLP, CiteSeer, etc.,

  • omit attributes,
  • do not write “Proc.\ ” or “Proceedings” for the proceedings of a conference, and
  • add additional attributes like DOI, URLs, month, the names of the editors, etc., which we do not want to see.

See also the rules for citations and BibTeX entries (Regeln für Zitate und bibtex-Einträge)

LaTeX

  • Citations
    • bad: 'We use a cool data structure.\cite{coolGuy}'
    • better: 'We use a cool data structure~\cite{coolGuy}.' (cite before the full stop; the tilde is a non-breaking space that prevents a line break before the citation)
  • Cite with Reference
    • bad: 'We do that like~\cite{coolGuy}.' (missing object)
    • good: 'We do that like Cool Guy~\cite{coolGuy}.' (add author name(s))
    • alternative: 'like \citet{coolGuy}.' with natbib package
  • References
    • bad: '… see Fig. \ref{coolFigure}.'
    • good: '… see Fig.~\ref{coolFigure}.' (the tilde prevents a line break between “Fig.” and the number)
    • alternative: '… see \Cref{coolFigure}.' with cleveref package
  • Images
    • use TikZ to draw your images directly in LaTeX
    • use Ipe or Inkscape for easy vector graphics drawing
  • Typography
    • use math-environment for variables
      • The difference between \(A_v\) and \(A_{\text{v}}\) is that the former is a variable parametrized by v, whereas the latter is a variable whose name carries the letter v as a subscript.
      • If a variable is called foo, then write \(\text{foo}\) in math mode instead of simply \(foo\), because \(foo\) looks like \(f \cdot o \cdot o\).
    • Keywords can/should be highlighted with \emph{…} or \emph{\textbf{…}}
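
The math-mode conventions from this page can be combined in a small LaTeX document. This is only an illustrative sketch: the sentences are placeholders (the “array of vowels”, the function name lcp, and the stated bounds are invented for the example), and it assumes the amsmath package for \text.

```latex
\documentclass{article}
\usepackage{amsmath} % provides \text
\begin{document}
% A variable A parametrized by a node v versus a variable whose name has a subscript v.
Let \(A_v\) be the array of the node \(v\), and let \(A_{\text{v}}\) be the array of vowels.

% A multi-letter name is set upright; otherwise it reads as the product f * o * o.
We write \(\text{foo} = 2n\) instead of \(foo = 2n\).

% An upright function name in math mode, a range bounded with \le, and bounds with units.
Computing \(\textup{lcp}(i, j)\) for \(1 \le i \le j \le n\) takes \(O(n \lg n)\) time
and \(O(n \lg n)\) bits of space.
\end{document}
```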

Git

We can support your thesis with a Git repository. Just send me your public SSH key so that you can access your freshly created repository. If you intend to write a thesis about compression, think about using our compression framework. We maintain our framework at the ITMC's repository hosting service. You need to register at this site first before I can add you as a member.

Supplementary Material

Programming

  • comments are always in English
  • write test-driven
  • learn how to benchmark
    • statistics like significance, median, arithmetic mean, etc.
    • how the required number of runs of an experiment depends on the expected value and the standard deviation
  • aim for large inputs! If your code works with 640KB, can it also cope (at least) with a single human genome taking approx. 3GB?
  • Learn how to write unit tests and how to log. Do not use the standard output for logging!
  • Use large data sets for testing like the Pizza&Chili Corpus
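
The statistics mentioned above under benchmarking can be computed with a few lines of standard C++. The following is a minimal sketch, assuming the running times of repeated runs are collected in a std::vector<double>; all function names are hypothetical. The standard error of the mean shrinks with the square root of the number of runs, which is one way to judge how many runs an experiment needs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Arithmetic mean of the measured values.
double arithmetic_mean(const std::vector<double>& samples) {
    assert(!samples.empty());
    return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}

// Median; the vector is taken by value because we sort a copy.
double median(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t mid = samples.size() / 2;
    if (samples.size() % 2 == 1) return samples[mid];
    return (samples[mid - 1] + samples[mid]) / 2.0;
}

// Sample standard deviation (divides by n - 1).
double sample_stddev(const std::vector<double>& samples) {
    assert(samples.size() >= 2);
    const double mean = arithmetic_mean(samples);
    double sum_sq = 0.0;
    for (double x : samples) sum_sq += (x - mean) * (x - mean);
    return std::sqrt(sum_sq / (samples.size() - 1));
}

// Standard error of the mean: decreases as the number of runs grows.
double standard_error(const std::vector<double>& samples) {
    return sample_stddev(samples) / std::sqrt(static_cast<double>(samples.size()));
}
```

The median is less sensitive to a single slow outlier run than the arithmetic mean, so reporting both, together with the number of runs, usually gives a clearer picture.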

C++

Courses/Material

Tools:

  • recommended compilers
    • g++ and gdb
    • clang and lldb
  • use cmake as a build tool
  • find bottlenecks with gprof
  • use valgrind to find memory leaks, and to analyze memory consumption
  • evaluate your memory usage with malloc_count
  • use gcov to find dead code (code never executed) and to ensure that your tests cover your source
  • For benchmarking your project, use the flags '-O3 -DNDEBUG'; for debugging, use '-O0 -ggdb' or '-Og -ggdb'. To debug your program together with a library, also compile the library with '-O0 -ggdb' instead of '-O3'.
  • Do not use 'cout' for logging! Use glog. For instance, to check the invariant a < b, use the macro DCHECK_LT(a,b).
  • write tests with gtest
  • benchmark speed with celero
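
The interplay between debug checks and the flag -DNDEBUG can be illustrated without any external library. The macro SKETCH_DCHECK_LT below is a hypothetical stand-in written for this page; it is not glog's implementation of DCHECK_LT, only a sketch of the idea that such checks are active in debug builds and compiled away in benchmark builds. The function element_at is likewise made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical debug-only check in the spirit of DCHECK_LT (not the glog macro).
#ifdef NDEBUG
// Benchmark build: the operands are not evaluated, so the check costs nothing.
#define SKETCH_DCHECK_LT(a, b) static_cast<void>(sizeof((a) < (b)))
#else
// Debug build: report the violated invariant on stderr (not stdout) and abort.
#define SKETCH_DCHECK_LT(a, b)                                         \
    do {                                                               \
        if (!((a) < (b))) {                                            \
            std::fprintf(stderr, "%s:%d: check failed: %s < %s\n",     \
                         __FILE__, __LINE__, #a, #b);                  \
            std::abort();                                              \
        }                                                              \
    } while (false)
#endif

// Returns data[i]; the index is validated only in debug builds.
int element_at(const int* data, std::size_t length, std::size_t i) {
    SKETCH_DCHECK_LT(i, length);
    return data[i];
}
```

Compiled with '-O0 -ggdb', an out-of-range index stops the program at the faulty call; compiled with '-O3 -DNDEBUG', the check disappears and does not distort the measured running time.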

Programming Etiquette

  • prefer uint64_t to int if you want an unsigned integer taking 64 bits.
  • avoid casts; C-style casts are easy to get wrong, and dynamic_casts in particular can take time
  • avoid class hierarchies (like in Java), since downcasting references/pointers takes time
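
Why the integer width matters for large inputs can be checked at compile time. The sketch below assumes a text of 3 · 10^9 bytes, roughly the genome size mentioned on this page; the names kGenomeBytes and can_index are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Approximate size of a human genome in bytes (about 3 GB, an assumption for this example).
constexpr std::uint64_t kGenomeBytes = 3000000000ULL;

// True if every length up to text_length can be stored in the integer type Index.
template <typename Index>
constexpr bool can_index(std::uint64_t text_length) {
    return text_length <= static_cast<std::uint64_t>(std::numeric_limits<Index>::max());
}

// A signed 32-bit integer stops at 2^31 - 1 and cannot address such a text,
// whereas uint64_t can.
static_assert(!can_index<std::int32_t>(kGenomeBytes), "3 GB exceeds the int32_t range");
static_assert(can_index<std::uint64_t>(kGenomeBytes), "3 GB fits into uint64_t");
```

On common platforms int is 32 bits wide, so a loop counter of type int silently breaks on such inputs, while a fixed-width type documents the intended range.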

Interesting data structure libraries

Caveats

  • std::vector keeps track of two sizes: its capacity (the number of elements for which memory is allocated, also called reserved size) and the number of elements it actually contains (its size). Call reserve(n) to allocate memory for at least n elements up front, while resize(n) changes the number of elements. To free all memory, call clear() followed by shrink_to_fit(); note that shrink_to_fit() is only a non-binding request.
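
The effect of reserve, resize, clear, and shrink_to_fit on the size and the capacity of a std::vector can be observed directly. The following sketch (the function name vector_capacity_demo is hypothetical) only checks properties that hold on every conforming implementation; what the capacity is after shrink_to_fit() is left open on purpose, because that call is a non-binding request.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Walks through the calls and returns true if all expected properties hold.
bool vector_capacity_demo() {
    std::vector<std::uint64_t> v;
    bool ok = v.size() == 0;

    v.reserve(1000);     // allocates memory for at least 1000 elements, size stays 0
    ok = ok && v.size() == 0 && v.capacity() >= 1000;

    v.resize(10);        // now holds 10 value-initialized (zero) elements
    ok = ok && v.size() == 10 && v[9] == 0 && v.capacity() >= 1000;

    v.clear();           // removes the elements but keeps the allocated memory
    ok = ok && v.empty() && v.capacity() >= 1000;

    v.shrink_to_fit();   // asks the implementation to release the unused memory
    ok = ok && v.empty();
    return ok;
}
```

Printing v.capacity() after the last step on your own platform shows whether the request was honored there.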

Java

Tools for String Analysis

How to Prepare Slides

  • maximize: illustrations, examples, motivation
  • minimize/drop: formulae, whole sentences
  • I do not recommend using beamer/latex, especially with some standard template.

See also additional tips in German

 
Last modified: 2018-10-18 08:55 by Johannes Fischer