Monday, October 2, 2017

Monday, August 14, 2017

Copy of the letter sent to the Minister of Patents in India about their website

Dear Nirmala Sitharaman,
I am writing to bring to your attention an issue with the Indian patent record system that is hampering researchers like me and that may require a policy change. More broadly, I would like to propose rethinking how the knowledge shared on the Indian Patent website can remain open and easily accessible to a global audience of researchers.

I am a Webmaster by trade and an Independent Researcher. I am a native New Yorker and a graduate of NYU Tandon (in Downtown Brooklyn). I am working on a research project that compares open source patents, with a focus on the Python programming language. My Python programmer and I have spent a lot of time troubleshooting ways to scrape text-based data from your site, with little success. My interest is in open source technologies and patents involving Python.

I also saw that the Indian patent records are not part of Google Patents and that the site does not support scraping, let alone the advanced search support offered on US patent websites (such as queries to sort patents). This came as a surprise to me, since I had believed India to be the second most open place (after the USA) for sharing information and supporting open source culture. This is another area that needs new thinking and reform.

Additionally, I would like to know whether it is possible to buy the patent data on a CD or in a packet from the patent office. I did not find an option to do that. My intent is to do text analysis to see patent growth in a particular field of technology.
For example, the dates I am interested in are August 1 through 12, 2017. We are comparing the patents granted according to the US Patent and Trademark Office's website records with the patents granted by your office on the same dates. We are blogging this activity as it unfolds, so we will be including this letter as part of our research activity. I would appreciate your suggestion on the best way to address this inequity and would also like to know whether there are opportunities to take part in improving the way the system is currently set up.

API link for the US Patent and Trademark Office patent information:

I thank you for your time and attention,

Diane Ludin

News story on Farmers "proper seed use"

This one is dated today, August 14th, 2017:

Thursday, August 10, 2017

Notes on India's website repository for Patents 8-10-2017

Some notes on the differences between the websites that house the patents for India and the US.

A simple title search, which is possible on the US Patent Office website, generates an error in India’s online system. I therefore have to search other fields of the patent records on the Indian Patent Office site to find patents related to ‘open source’ and ‘python’. Also, the results on the Indian Patent Office site are not directly linkable as they are on the US Patent and Trademark Office’s website. Therefore, results from India’s patent system cannot be scraped in the same way as those from the US Patent Office’s site.

Additionally, the results do not seem to be related to software development. Most of them are about data science, machine learning and analytics. Many more records come back from an initial search with the terms “open-source” + “python” in the Description section of the Indian Patent Office records.

Question: What is a better way to retrieve granted patents in India’s patent office website?  
Answer: Newer scraping methods in Python can probably retrieve and scrape India’s patent office website.

Question: How can we bypass the captcha system that the Indian patent website has?
Answer: We need to look at the different scraping methods available in Python, how each can be used, and their pros and cons. This is one of the technical obstacles we face in transferring our existing PHP technology that scrapes the US Patent and Trademark Office’s website.

The Indian patent site is not friendly to scraping the way we have done it for the US Patent and Trademark Office’s site. The Indian Patent Office website uses JavaScript to produce the results. We need to explore more ways to rebuild our algorithm specifically for the Indian Patent Office’s website. I will be conferring with my PHP developer to get his suggestions on how to approach the Indian Patent Office’s site.

Beautiful Soup is the Python library that Joshi has found to filter patents from the US Patent and Trademark Office. Beautiful Soup is not able to scrape the Indian Patent Office’s site because the data is not in the HTML page but is produced by JavaScript. We may need to research asynchronous web processes to communicate with the JavaScript used on the Indian Patent Office’s website.
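To illustrate why a static HTML parser comes back empty on a page whose results are filled in by JavaScript, here is a minimal sketch. It uses Python's built-in html.parser as a stand-in for Beautiful Soup so that it runs without installing anything. Both pages are made-up examples, not copies of either patent office's markup, and the names CellCollector, extract_cells and loadResults are hypothetical.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <td> cell found in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self.in_cell:
            self.cells[-1] += data


def extract_cells(html):
    """Return the stripped text of each table cell in the given HTML string."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


# Made-up page where the results are already written into the HTML.
STATIC_PAGE = """
<html><body>
<table id="results">
<tr><td>Example patent title A</td></tr>
<tr><td>Example patent title B</td></tr>
</table>
</body></html>
"""

# Made-up page where the table is empty until a script fills it in a browser.
SCRIPTED_PAGE = """
<html><body>
<table id="results"></table>
<script>
  // hypothetical function: rows are added in the browser after the page loads
  loadResults("results");
</script>
</body></html>
"""

static_cells = extract_cells(STATIC_PAGE)      # the two example titles
scripted_cells = extract_cells(SCRIPTED_PAGE)  # nothing to find in the raw HTML

print(static_cells)
print(scripted_cells)
```

The parser only sees the text that the server sends. When the rows are created later by a script, a tool that does not run JavaScript has nothing to read, which matches what we are seeing.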

Question: Where do we start?
Answer: Joshi has located the email address of the officer at the Indian Patent Office. Diane will craft a letter to the officer requesting access to the patents website and asking how we can access the patent records published in the database. Joshi will review it, then we will send it off and see what response we receive. In the interim we will manually download the Abstract and the Claims of the patents granted in India.
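Once the Abstract and Claims text has been saved by hand, a first pass at picking out the records that mention both of our search terms could look like the sketch below. This is only an assumption about how we might organize the files. The record layout, the placeholder ids and texts, and the helper names normalize and mentions_all are all hypothetical, not real patent data or our actual code.

```python
import re

# The two search terms used in the notes above.
KEYWORDS = ("open source", "python")


def normalize(text):
    """Lowercase, turn hyphens into spaces and collapse whitespace,
    so that "Open-Source" and "open source" compare equal."""
    return re.sub(r"\s+", " ", text.lower().replace("-", " ")).strip()


def mentions_all(record, keywords=KEYWORDS):
    """True if every keyword appears in the abstract or the claims."""
    combined = normalize(record["abstract"] + " " + record["claims"])
    return all(keyword in combined for keyword in keywords)


# Placeholder records standing in for manually downloaded text.
records = [
    {
        "id": "placeholder-1",
        "abstract": "A system built with open-source tools.",
        "claims": "A method implemented in Python.",
    },
    {
        "id": "placeholder-2",
        "abstract": "A seed coating process.",
        "claims": "A coating composition.",
    },
]

matches = [record["id"] for record in records if mentions_all(record)]
print(matches)
```

Matching on plain substrings is crude, but it is enough to sort a small, manually collected batch before any scraping is in place.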

Right now, Google Patents does not include patents from the Indian patents website. Additionally, a captcha is installed on the Indian patents website to deter scraping.

Friday, July 7, 2017

For Your Information (FYI)

Video on how to search for a patent:

Google Patents video:

Suburban Tool Inc: should you patent your product or idea?

Announcing our GitHub account link

Here is our GitHub account, which we are using to share Django and Python files in progress. Here is the address: