We seek a highly motivated data engineer to help develop custom tools that facilitate our research, focusing primarily on web scraping and web crawling. You will work closely with senior researchers to create tools that extract and ingest data in multiple languages, and to refine and adapt these tools for different research projects. You will also be responsible for developing storage and retrieval solutions to manage large volumes of data efficiently. Reading proficiency in Chinese is a plus but not required.

Responsibilities Include:
- Work with researchers to develop custom web scrapers and crawlers to facilitate automated data collection for ongoing client projects
- Implement mechanisms for efficiently storing and organizing scraped data, making it readily retrievable for researchers
- Advise company management on data best practices, and propose solutions to data-related challenges

Desired Knowledge, Education, and Experience:
- Solid experience with Python
- Familiarity with techniques and tools for crawling, extracting, and processing data (e.g., Scrapy, Pandas, MapReduce, Beautiful Soup), and experience running large-scale data scraping projects
- Database development experience, focused on high-volume data storage and retrieval applications
- Familiarity with Linux/UNIX, HTTP, HTML, JavaScript, and networking
- Experience with applications designed to display archived web content
- Great communication skills
- Knowledge of which tools are suited for a given task, along with the ability to devise solutions beyond standard best practices
- Bachelor’s degree in Computer Science or a related field, or the equivalent demonstrated experience
- The ability to work 15-20 hours per week, with flexibility for additional hours as required

The Data Engineer is a remote position with flexible scheduling. This job description is intended to describe the general nature of the job and does not represent that all individuals who hold the position will perform all duties set forth above.
Data Engineer
28
Sep