- Create a crawler to crawl and cache the pages of the web.
- Store the index of the cached pages on a server. The storage needed is enormous if you target the whole web, so try caching one country's sites first and, if that succeeds, expand to other countries around the world.
The main components are a parser/crawler, an index builder, a query processor (compiler, builder, matcher) and snippet generation.
- Filter the results according to ranking factors.
Caching and indexing run automatically once they are set up. Filtering the results needs to take quality factors into account, as Google does.
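The index builder, query matcher and snippet generation steps above can be sketched in a few lines. This is only a minimal illustration, not how a production engine works: the `PAGES` dictionary, the example.com URLs and the function names are all made up for the example, and the only "ranking factor" used is how often the query words occur on a page.

```python
import re
from collections import defaultdict

# Hypothetical cached pages (URL -> text); a real crawler would fill this in.
PAGES = {
    "http://example.com/a": "Python is a popular programming language for web crawlers.",
    "http://example.com/b": "A search engine builds an index of cached web pages.",
    "http://example.com/c": "Web pages are ranked before the search results are shown.",
}

def tokenize(text):
    """Parser step: lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Index builder: map each word to {url: number of occurrences}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Query matcher: keep pages containing every query word,
    ranked by total occurrences (ties broken by URL)."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], {}))
    for word in words[1:]:
        matches &= set(index.get(word, {}))
    scored = [(sum(index[w][url] for w in words), url) for url in matches]
    return [url for _, url in sorted(scored, key=lambda s: (-s[0], s[1]))]

def snippet(text, query, width=30):
    """Snippet generation: return the text around the first query word found
    (plain substring match, which is enough for a sketch)."""
    lowered = text.lower()
    positions = [lowered.find(w) for w in tokenize(query) if w in lowered]
    if not positions:
        return text[:2 * width]
    first = min(positions)
    return text[max(first - width, 0):first + width]

index = build_index(PAGES)
for url in search(index, "web pages"):
    print(url, "->", snippet(PAGES[url], "web pages"))
```

A real engine would replace the occurrence count with many more quality factors, but the flow (tokenize, index, match, rank, snippet) stays the same.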
If you don't have a unique feature, it is very hard to market your search engine.
Maintenance costs are heavy.
Creating a basic search engine for your website.
You can use Google Custom Search to add search to your own website: enter your website's URL in the Google Custom Search menu and embed the generated search box on your site.
Using Google Custom Search as a search engine.
Example: webcrawler.com uses the Google and Yahoo search databases through an API, which is very simple to set up.
If you use Google search as your search engine, as khoj.com does, you can get a 50% revenue share on ads.
Pros: no caching, no indexing and no hosting, just a simple script.
Cons: the search bar shows “powered by Google” (to remove this you need to upgrade to premium).
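Besides the embedded search box, Google also offers a Custom Search JSON API for fetching results from your own script. The sketch below only builds the request URL and reads titles and links out of a response; the key and engine ID are placeholders you would get from your own Google account, and the helper names are made up for this example.

```python
from urllib.parse import urlencode

# Endpoint of Google's Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def build_search_url(api_key, engine_id, query, start=1):
    """Build the request URL: `key` is your API key, `cx` the search engine ID,
    `q` the query and `start` the index of the first result to return."""
    params = {"key": api_key, "cx": engine_id, "q": query, "start": start}
    return API_ENDPOINT + "?" + urlencode(params)

def extract_results(response):
    """Pull (title, link) pairs out of a decoded JSON response.
    Results are listed under "items"; the key is absent when nothing matched."""
    return [(item["title"], item["link"]) for item in response.get("items", [])]

# Placeholder credentials, for illustration only.
print(build_search_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "search engine"))
```

Fetching the URL (for example with `urllib.request.urlopen`) and decoding the body with `json.loads` gives the dictionary that `extract_results` expects.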
Google search engine with your own brand (premium).
All result rankings are the same as Google's, but by paying Google a fee you can legally replace the Google logo with your own.
Why not compete with Google?
- Google has its own NoSQL (Not only SQL) database system called Bigtable, which runs on the Google File System (GFS), a distributed file system owned by Google.
- The open-source alternative to GFS is Hadoop's file system (HDFS). The HBase and Hypertable NoSQL databases work on top of it.
If you are unable to create a spider yourself, a ready-made spider/crawler called Inout Spider is available.
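To get a feel for what a spider does before choosing a ready-made one, here is a minimal breadth-first crawler sketch. It is not how Inout Spider or any real crawler is implemented: `FAKE_WEB` is a made-up in-memory "web" so the example runs without network access, and `fetch` stands for whatever function downloads a page (returning `None` when the page is missing).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect the absolute URLs of all <a href="..."> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl: fetch a page, cache it, queue its unseen links."""
    cache = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue and len(cache) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page missing or download failed
            continue
        cache[url] = html
        extractor = LinkExtractor(url)
        extractor.feed(html)
        for link in extractor.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return cache

# Hypothetical three-page site used instead of real HTTP requests.
FAKE_WEB = {
    "http://example.com/": '<a href="/about">About</a> <a href="http://example.com/news">News</a>',
    "http://example.com/about": '<a href="/">Home</a>',
    "http://example.com/news": '<a href="/missing">Gone</a>',
}

cached = crawl("http://example.com/", FAKE_WEB.get)
print(sorted(cached))
```

A real spider would also need politeness delays, robots.txt handling and error recovery, which is part of why maintenance is expensive.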
Other search engines
Yahoo and Bing use the same database to compete with Google.
baidu.com: a successful Chinese search engine that took Google's place there.
yandex.com: a Russian search engine.
webcrawler.com: pulls results from Yahoo and Google (just like Google Custom Search).
duckduckgo.com: a search engine that never tracks its users.
An Indian search engine has not been created yet.
Sify created khoj.com, which is powered by Google. An intermediate student from Telangana created a search engine called tsearch.in, whose unique feature is searching for files. He is now trying to create a job website.
Top 10 search engines
- AOL Search
- WebCrawler
- DuckDuckGo
The search engines above are mostly popular in the USA only.
Internet subscribers worldwide.
1st: US, 33 crore, i.e. 330 million (the American population is not more than 40 crore)
2nd: China, 50 crore (500 million)
3rd: India, 30 crore (300 million). Values are approximate.
If anyone has a strong desire to make a search engine, join in: [email protected]