Abyssinica – The Amharic Search Engine

A search engine is a software system that searches a data source for information matching a set of search terms. The data source could be the web, a database, or a set of documents. Search engines use complex algorithms to take a search query and return results that are usually relevant to it. Popular examples of search engines are Google, Bing, and Yahoo!.

Search engines are answer machines. When a person performs a search, the search engine scours its corpus of documents, returns only those results that are relevant or useful to the searcher's query, and ranks them by popularity and relevance.

Many factors influence relevance, and relevance is not determined manually. Instead, the engines employ mathematical equations (algorithms) to sort the relevant content from the chaff, and then to rank that content in order of quality.
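To make "mathematical equations" concrete, here is a minimal sketch of one classic relevance formula, TF-IDF (term frequency times inverse document frequency). It is offered as a generic illustration only: it is not a description of how Abyssinica or any named search engine ranks results, and the function names and sample documents are invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a real engine does far more."""
    return text.lower().split()


def rank(query, documents):
    """Return (score, index) pairs for the documents, best match first."""
    doc_tokens = [tokenize(d) for d in documents]
    n_docs = len(documents)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in doc_tokens:
        df.update(set(tokens))

    scored = []
    for index, tokens in enumerate(doc_tokens):
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / df[term]) + 1.0  # +1 keeps idf positive
            score += tf * idf
        scored.append((score, index))

    # Highest score first; ties keep the original document order.
    scored.sort(key=lambda pair: -pair[0])
    return scored


# Made-up sample corpus, for illustration only.
documents = [
    "amharic search engine for amharic documents",
    "cooking recipes and kitchen tips",
    "a search tool for english documents",
]
results = rank("amharic search", documents)
```

In this sketch a term counts for more when it is frequent in a document but rare across the corpus, so the first document, which mentions "amharic" twice, ranks above the one that only mentions "search".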

Search engines apply language processing tools to analyze content written in a specific language. For example, “የሕንጻ አወቃቀር” and “የግንብ አሰራር” are literally different but conceptually similar. Search engines take such conceptual similarity into account while analyzing documents. For Semitic languages like Amharic, morphological analysis is extremely complex. In English, run, runs, ran, running, and runner are forms or derivatives of the same lexeme ‘run’, and an English lexeme has far fewer than a hundred such derivatives. In Amharic, by contrast, a single word can give rise to thousands of other words. For example, ባይሳካልኝም (even if I am not successful) can be derived from the word ስኬት (success), and ስኬት is itself derived from ሰካ (ሰካ -> ተሳካ -> ስኬት). The translation of a single Amharic word can be a whole sentence in English. As of now, none of the popular search engines analyzes Amharic properly. That is why we invented Abyssinica!
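The idea of reducing different surface forms to a shared origin can be sketched with a small lookup table. The table below is hypothetical and hand-written from the examples in the paragraph above. A real Amharic morphological analyzer cannot enumerate thousands of derived forms this way and would need rules or a trained model, so this shows only the goal of the analysis, not Abyssinica's method.

```python
# Hypothetical, hand-written table built from the examples in the text:
# each word points to the word it is derived from.
DERIVED_FROM = {
    "runs": "run",
    "ran": "run",
    "running": "run",
    "runner": "run",
    "ባይሳካልኝም": "ስኬት",
    "ስኬት": "ተሳካ",
    "ተሳካ": "ሰካ",
}


def root_of(word):
    """Follow derivation links until a word with no recorded parent is reached."""
    seen = set()  # guards against accidental cycles in the table
    while word in DERIVED_FROM and word not in seen:
        seen.add(word)
        word = DERIVED_FROM[word]
    return word


def same_root(a, b):
    """True when both words trace back to the same origin in the table."""
    return root_of(a) == root_of(b)
```

With this table, `root_of("running")` yields `run`, and `root_of("ባይሳካልኝም")` walks the chain back to `ሰካ`, so a query and a document that use different derived forms can still be matched.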

Today, because they lack the ability to filter Amharic content, search engines return results that are not age- or culture-appropriate, regardless of the user's search settings. Sadly, irrelevant Amharic content (including offensive words) has been indexed and ranked well by such search engines.

Abyssinica Search Engine uses our in-house Amharic Language Processing tool for better content analysis, and aims eventually to clean up Amharic web search by providing results that are appropriate to Ethiopian culture and to the user's age.

Abyssinica Custom Search for websites provides relevant, contextualized information to your website's users. For example, if a user searches for “የሕንጻ አወቃቀር” and your website has no content about ‘ሕንጻ’ but does have a page about “የግንብ አሰራር”, then the page about “የግንብ አሰራር” will be returned as a search result. Please see Abyssinica Custom Search in action on AmharicBible.org.
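The behaviour described above can be sketched as matching on concepts rather than on literal words. Everything in the example is an assumption made for illustration: the concept table is hand-written, the labels are arbitrary identifiers rather than translations, and the real Abyssinica Custom Search is not documented here.

```python
# Hypothetical concept table; the labels are arbitrary identifiers and the
# grouping simply mirrors the example given in the text.
CONCEPT_OF = {
    "የሕንጻ": "C_BUILDING",
    "የግንብ": "C_BUILDING",
    "አወቃቀር": "C_CONSTRUCTION",
    "አሰራር": "C_CONSTRUCTION",
}


def concepts(text):
    """Map each whitespace-separated word to its concept, or to itself if unknown."""
    return {CONCEPT_OF.get(word, word) for word in text.split()}


def search(query, pages):
    """Return the page titles that cover every concept in the query."""
    wanted = concepts(query)
    return [title for title in pages if wanted <= concepts(title)]


# A tiny stand-in for a website's pages, identified by title only.
pages = ["የግንብ አሰራር", "ስኬት"]
hits = search("የሕንጻ አወቃቀር", pages)
```

Here the query shares no literal word with the first page, yet the page is returned because both map to the same two concept labels.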

Apart from being a search engine, Abyssinica can analyze two or more documents for content similarity. Abyssinica will tell you whether two Amharic documents are about the same concept. With the help of EthiopicOCR, Abyssinica can analyze and apply full-text search to paper documents, PDFs, and scanned documents. Abyssinica Dictionary for Web, Mobile, and Microsoft Office are other related products available for free.
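One common way to score how alike two documents are is cosine similarity over word counts. The sketch below is a generic illustration under that assumption. It is not Abyssinica's algorithm, and the function name is invented for the example.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts.

    Returns a value between 0.0 (no shared words) and 1.0 (same word mix).
    """
    a = Counter(text_a.split())
    b = Counter(text_b.split())

    # Only words present in both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))

    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Literal comparison of the two example phrases finds no shared words.
literal_score = cosine_similarity("የሕንጻ አወቃቀር", "የግንብ አሰራር")
```

Because it compares surface words only, this measure scores the two example phrases as unrelated. That is why a similarity tool for Amharic would first need to normalize words to shared lexemes or concepts before comparing counts.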

Although Abyssinica is capable of crawling and analyzing all Amharic web content, as Bing and Google do, we are very careful about how suitable and useful that content is to users, since the internet is not yet rich in Amharic content. At this time our higher priority is to provide knowledge resources to our users, so Abyssinica currently crawls and analyzes only a few sites. If you are a website owner and you feel your website could be a knowledge source for Ethiopians and Amharic speakers, please contact us to have your website included in Abyssinica Search.

