
Completed
Posted
Paid on delivery
Hindi and Indonesian Safety Hardening and Safety Dataset - Annotation
1. Annotation Requirement Description
This annotation task aims to construct safety datasets for Hindi and Indonesian through manual annotation.
1.1 Basic Task Information
Task Summary: Annotate five types of raw data (sensitive words, text samples, image samples, "image-text" pairs, "video-text" pairs) in Hindi and Indonesian according to requirements.
Deliverable Types and Formats:
a. Sensitive Words: Words, phrases. Delivered in Excel and JSONL formats only.
b. Text Samples: Sentences, paragraphs. Delivered in Excel and JSONL formats only.
c. Image Samples: Images in JPG or PNG format, stored in folders. Deliver Excel, JSONL, and corresponding attachment folders.
d. "Image-Text" Pairs: Consist of an image and matching text, stored in folders. Deliver Excel, JSONL, and corresponding attachment folders.
e. "Video-Text" Pairs: Consist of a video and matching text, no specific video format requirement, stored in folders. Deliver Excel, JSONL, and corresponding attachment folders.
Languages: 2 languages: Hindi and Indonesian.
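To make the JSONL deliverable concrete, here is a minimal Python sketch of how annotated items might be serialized and checked for duplicates. The posting does not define a schema, so the field names (`id`, `language`, `type`, `content`) and the sample values are assumptions for illustration only.

```python
import json

# NOTE: all field names below are hypothetical; the posting specifies
# the file formats (Excel, JSONL) but not the record schema.

def to_jsonl(records):
    """Serialize a list of dicts as JSONL: one JSON object per line."""
    # ensure_ascii=False keeps Devanagari and other non-ASCII text readable.
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

def find_duplicates(records, key="content"):
    """Return each value of `key` that occurs more than once, in first-seen order."""
    seen, dupes = set(), []
    for record in records:
        value = record[key]
        if value in seen and value not in dupes:
            dupes.append(value)
        seen.add(value)
    return dupes

records = [
    {"id": 1, "language": "hi", "type": "sensitive_word", "content": "उदाहरण"},
    {"id": 2, "language": "id", "type": "text_sample", "content": "contoh kalimat"},
    {"id": 3, "language": "id", "type": "text_sample", "content": "contoh kalimat"},
]

jsonl = to_jsonl(records)
duplicates = find_duplicates(records)
```

Media items (images, "image-text" and "video-text" pairs) would presumably also carry a relative path into the attachment folder, but that layout is likewise not specified in the posting.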
Project ID: 40218909
32 proposals
Remote project
Active 23 days ago

Drawing on my extensive experience in data analysis and handling, I am confident I can successfully fulfill the annotation tasks for your Hindi and Indonesian safety datasets. Ranked among the top 0.03% of all freelancers, I have a proven track record of providing high-quality, error-free work within set deadlines. With my proficiency in Python and Excel, I will ensure that your deliverables are presented in both Excel and JSONL formats as required, with all relevant information meticulously organized. Moreover, my ability in processing, scraping and managing large data sets guarantees an efficient yet accurate completion of the job. And given the sensitive nature of these annotations, my utmost priority will be to handle them with the necessary tact and confidentiality. In conclusion, choosing me will not only give you access to my technical expertise but also my dedication to producing precise and value-driven outputs. I look forward to discussing this project further with you and ensuring that you receive results that meet your requirements and exceed your expectations. Let's foster a long-term working relationship as I am open to assisting you in future projects too!
$100 USD in 60 days
6.0

I’m an Indonesian native with prior experience in safety-focused data annotation. I’m familiar with content classification (harmless, positive, negative, boundary vs non-boundary) and can follow detailed annotation guidelines with high accuracy.
$100 USD in 60 days
7.0
32 freelancers are bidding an average of $1,938 USD for this job

Dear Client, Would you like to see a preliminary demo of our Hindi-Indonesian safety data annotation solution before making any commitments? Our expertise in data annotation ensures precision in identifying sensitive words, text, images, and multi-modal pairs, all tailored to your project requirements. Let’s discuss how our solution can enhance your safety dataset construction and arrange a demo to illustrate our capabilities in action. Regards, Smith
$2,250 USD in 7 days
7.1

With over 10 years of experience in web and mobile development, including expertise in blockchain and AI/ML, I understand your need for constructing safety datasets for Hindi and Indonesian through manual annotation. Your project requires annotating sensitive words, text and image samples, as well as "image-text" and "video-text" pairs in both languages. I have a proven track record in delivering tailored solutions for various industries like fintech, eCommerce, and blockchain, ensuring successful project outcomes. My experience in building scalable and feature-rich solutions aligns perfectly with the requirements of your project. If you're looking for a reliable and experienced developer to handle your Hindi-Indonesian Safety Data Annotation project, I am here to assist you every step of the way. Let's discuss how we can collaborate and achieve your project goals seamlessly.
$2,400 USD in 30 days
6.4

⭐⭐⭐⭐⭐ CnELIndia and Raman Ladhani will execute a structured safety-annotation workflow covering sensitive words, text, image, image-text, and video-text datasets in Hindi and Indonesian.
Step 1: requirement mapping, taxonomy setup, and safety guideline alignment.
Step 2: recruit and brief bilingual annotators with domain-specific training led by Raman Ladhani to ensure cultural and linguistic accuracy.
Step 3: collect and preprocess data through secure pipelines, including web scraping, filtering, and metadata structuring.
Step 4: multi-layer manual annotation with quality checkpoints, conflict resolution, and safety validation.
Step 5: standardized data formatting and management using automated Excel and JSONL generation, verified attachments, and organized folder structures for media.
Step 6: continuous QA audits, progress tracking dashboards, and compliance reviews by CnELIndia project leads.
Step 7: final validation, clean delivery, and documentation ensuring scalable, consistent, and reliable safety datasets ready for deployment.
$2,250 USD in 7 days
5.9

As an Excel virtuoso with a keen eye for detail and an unwavering commitment to efficiency, I'm confident that I can deliver unparalleled quality in the manual annotation task at hand. With my extensive experience in data analysis and processing, specifically leveraging Excel, VBA and Google Sheets to their full potential, I guarantee a smooth and streamlined process that aligns readily with your project's requirements. Having successfully completed 1000+ similar projects, I have developed a refined methodology that ensures not only accuracy but also compatibility with different data formats. Whether it's sensitive words, text samples, image-text / video-text pairs or others, I will organize and deliver them in the precise formats you require (Excel, JSONL) without compromising on the quality of the annotations. Choosing me means choosing a highly responsible individual with invaluable skills to handle your delicate project. Trusting me with this may be your best decision yet. Let's collaborate and build excellent safety datasets for Hindi and Indonesian together!
$2,250 USD in 7 days
5.7

Hello client, I’ve carefully reviewed your job description and have strong experience in Data Processing, Data Entry, Data Collection, Data Management, Data Delivery, Web Scraping, Excel, and Data Analysis. I can build a reliable web scraping solution tailored specifically to your needs. Whether using Node.js with Puppeteer/Cheerio or Python with Selenium/BeautifulSoup, I will extract, clean, and organize your data efficiently. I also handle anti-bot protections, pagination, and full automation as required. As you can see from my profile, my web scraping reviews are excellent, reflecting my commitment to quality work. I focus on writing clean, maintainable, and scalable code because I know the difference between 99% and 100%. If you hire me, I’ll do my best until you’re completely satisfied with the result. Let’s discuss your target website and preferred data format. Thanks, Denis
$1,700 USD in 15 days
5.1

Hello there, I have done similar work for many clients with 100% satisfaction. Inbox me for details. I can start your project right now and deliver with 100% accuracy within the deadline. I am happy to provide a work sample. Please let me know. Thank you.
$1,500 USD in 3 days
5.2

Hi, I have experience creating structured, high-quality datasets and can help annotate Hindi and Indonesian content across text, images, and video efficiently and accurately. I’ll follow your specifications for sensitive words, text samples, image-text, and video-text pairs, delivering everything neatly in Excel and JSONL formats with properly organized folders for attachments. My approach ensures consistency across annotations, with attention to language nuances in both Hindi and Indonesian, so the datasets are reliable for downstream AI or analysis tasks. I can also implement a simple QA step to catch errors before delivery, saving time in revisions. With clear communication and a structured workflow, I’ll make sure your safety dataset is ready for immediate use without headaches. Please get in touch. Best
$2,600 USD in 15 days
4.6

Hello, I specialize in AI data services & safety hardening pipelines for multilingual datasets (NLP + Vision). I’ve delivered large-scale manual annotation + compliance labeling for sensitive content across text, image, and video formats, meeting strict schema, QA, and delivery SLAs.
What I’ll Deliver (as per spec):
✔ Sensitive Words, Text, Images, Image-Text, Video-Text
✔ Formats: Excel + JSONL + structured folders
✔ Languages: Hindi & Indonesian
✔ 95%+ accuracy with double-pass QA + boundary checks
Techniques & Workflow:
Source & Scrape: Ethical web scraping + dataset balancing
Annotation SOPs: Label L1/L2/L3, boundary vs non-boundary
QC Pipeline: Inter-annotator agreement, spot audits
Automation: Python validators for schema, duplicates, UUIDs
Secure Delivery: Batch delivery, encrypted handover
Tools: Excel, Python, JSONL validators, Web Scrapers, QA Dashboards
Relevant Projects:
• Multilingual Content Safety Dataset (Hindi/EN) – 85k items
• Vision-Text Moderation Dataset for AI Trust & Safety
• Social Media Risk Labeling Pipeline (Text/Image/Video)
I can show demo code + sample outputs today. Let’s finalize scope, milestones, and pricing and start immediately.
$3,000 USD in 30 days
4.6

Dedicated Freelancer Ready to Elevate Your Project for Hindi-Indonesian Safety Data Annotation. With a solid background in Data Collection, Data Delivery, Data Management, Web Scraping, Excel, Data Processing, Data Entry, and Data Analysis, I bring valuable expertise to your project. I have successfully completed many projects with 100% client satisfaction. Clear and timely communication is my priority. I believe in keeping you informed throughout the project lifecycle. I am available for a discussion at your earliest convenience. Please feel free to contact me to further discuss your project details. Thank you for considering my bid. I am excited about the opportunity to contribute to the success of your project. Please visit my portfolio to check my previous work samples here: https://www.freelancer.com/u/GraphicsHub2k24?page=portfolio&w=f&ngsw-bypass= Best regards, Muhammad Asim Khan
$1,500 USD in 1 day
4.3

Greetings! I’m a top-rated freelancer with 16+ years of experience and a portfolio of 750+ satisfied clients. I specialize in delivering high-quality, professional Hindi-Indonesian safety data annotation services tailored to your unique needs. Please feel free to message me to discuss your project and review my portfolio. I’d love to help bring your ideas to life! Looking forward to collaborating with you! Best regards, Revival
$1,500 USD in 14 days
4.2

I understand your need for the Hindi-Indonesian safety data annotation project you have described in detail. My experience in Data Processing, Data Entry, Data Analysis, and more aligns perfectly with your requirements. With my skills and expertise, I can annotate the five types of data you specified efficiently to build high-quality safety datasets. My background ensures I deliver accurately annotated sensitive words, text, images, and pairs, compiled in Excel and JSONL formats with folder attachments. Additionally, I excel in tasks similar to constructing safety datasets. Although I am offering reduced introductory rates, you will benefit from dedicated work and premium outcomes. Your search for quality annotation ends here. Have confidence in collaborating with me to achieve your project goals effectively. Anticipating a successful partnership. Regards, Jason McLachlan
$2,100 USD in 3 days
3.2

Hello, I have carefully reviewed the project description and fully understand what you need and why this project is sensitive and important to you. The goal of this project is to build a precise and reliable Safety Dataset for two languages, Hindi and Indonesian (Bahasa Indonesia), covering five main data types: sensitive words, text samples, images, image–text pairs, and video–text pairs. All of this data must be manually reviewed and annotated with high accuracy, in compliance with safety and sensitive content standards. The final deliverables should be provided in Excel and JSONL formats, along with the corresponding organized file folders.
My understanding of the project is that your focus is not merely on delivering data, but on data quality, consistency, and usability for Safety Hardening processes. For this reason, my approach is to build a dataset that is clean, well structured, and reliable, rather than simply filling files.
My Working Approach:
Data will either be provided directly by you or collected from specified sources
Each item (text, image, or video) will be reviewed manually
Precise annotation will be applied according to your guidelines or taxonomy
The final output will be delivered in a format that is ready for use without requiring revisions
I have strong expertise in Python and extensive hands-on experience in data collection, scripting, data processing, and structured dataset preparation. If needed, I can also develop scripts to extract data from websites and social media platforms. In addition to technical skills, due to the sensitive nature of this project, I place the highest priority on human judgment, focus, accuracy, and confidentiality.
Proposed Project Phases:
Phase 1 – Data Preparation:
Receiving or collecting the data
Reviewing data structure and formats
Categorizing data by type (text, image, video, etc.)
Phase 2 – Annotation and Processing:
Manual review of content for safety and sensitivity
Applying labels according to your guidelines
Generating standardized Excel and JSONL files
Phase 3 – Final Delivery and Quality Control:
Final quality and consistency checks
Full delivery of files along with related folders
Applying any necessary revisions until the output fully meets your expectations
To start the project efficiently and accurately, I would appreciate your answers to the following questions:
What is the approximate volume of the data?
What are the data sources?
Do you have specific annotation guidelines or a taxonomy?
Will the data be fully provided, or is data extraction also required?
What is the project timeline and priority?
I am committed to remaining available after delivery to support any required revisions or optimizations, ensuring the final output matches your expectations precisely. I would be happy to begin the discussion and coordinate the project details more closely. Thank you.
$1,500 USD in 5 days
2.3

Hello, I can handle the 1,876 Indonesian items (including sensitive words, text, images, image-text, and video-text). I understand the annotation requirements, formatting (Excel & JSONL), and quality standards (≥95% accuracy). I will complete this batch within 8–10 days and can deliver in stages to ensure quality. I’m ready to start immediately once the milestone is funded. Thank you.
$206 USD in 15 days
1.5

Hello, I'm Aditya Prasetya, a Fullstack Developer with a wide range of professional experience in web development and design. My experience spans several sectors, including ERP web and mobile applications for the mining, textile, and chemical sales industries. These experiences have given me the meticulousness necessary for your project. I am well-versed in handling large datasets efficiently, maintaining data integrity, and delivering structured results that adhere to project requirements. Moreover, my extensive programming skills, including expertise in Python and knowledge of databases such as MySQL, PostgreSQL, and MongoDB, will allow me to process the raw data and auto-evaluate the completed task in multiple formats (JSONL and Excel) before delivering them to you. With meticulous attention to detail and proficiency in Hindi and Indonesian, I believe I'm well-placed not just to complete the task but to add value through accurate labelling and annotations. Let's team up to create valuable safety datasets for Hindi and Indonesian! I'm excited about the challenge this project offers, and I assure you of my dedicated commitment to quality workmanship and timely delivery.
$1,800 USD in 14 days
0.6

Hi, I’d be happy to contribute. I have experience in Web Scraping, Data Collection, Data Delivery, Data Entry, Data Analysis, Data Management, Data Processing and Excel. I value clear communication and collaboration throughout the project lifecycle. Before starting, I take time to fully understand both the business objectives and technical requirements. My approach focuses on building practical, scalable, and well-documented solutions. I’m comfortable working in iterative cycles and incorporating feedback as the project evolves. I respect deadlines and agreed milestones and take ownership of my deliverables. I can adapt easily to your preferred tools, tech stack, and workflow. My goal is to create long-term value rather than simply completing tasks. I’d welcome the opportunity to discuss your project and next steps. Best regards, Marko O.
$1,500 USD in 2 days
0.0

Having over [X] years of experience as a Virtual Assistant, I am intimately familiar with the level of detailed and meticulous work that data annotation often requires. As an experienced professional, I have honed my skills in Shopify, data entry, and email management, which I believe are crucial for the successful completion of your project. With an exceptional ability to multitask and strong attention to detail, you can trust me to fully deliver on every one of your requirements. Accuracy and efficiency are two key attributes I pride myself on. My past projects have always demanded a keen understanding of language nuances and cultural context, which is paramount for your Hindi-Indonesian Safety Data Annotation project. This prior experience dealing with different languages and phrases has equipped me with a deep understanding of cultural sensitivities - ensuring absolute precision in preventing any unintentional phrases or sensitive words from escaping the net. We are not just technical performers but solution architects. When you choose me as your Virtual Assistant, you gain not only my proficiency but also an unwavering commitment to your success. Together, let's construct comprehensive safety datasets for Hindi and Indonesian that genuinely reflect the cultural context and contribute to safety enhancement!
$2,250 USD in 7 days
0.0

Hi, I appreciate the opportunity to work on the Hindi-Indonesian Safety Data Annotation project. You’re looking to create safety datasets by annotating various types of raw data in both languages, which is crucial for ensuring safe content handling. My approach would involve carefully reviewing and categorizing sensitive words, text samples, images, and their corresponding video-text pairs, ensuring accuracy and adherence to your specified formats like Excel and JSONL. With my experience in data processing and management, I can efficiently handle the annotation tasks while maintaining high quality. I'm particularly skilled in handling various data formats and can ensure that everything is organized in folders as required. One question I have is: do you have specific guidelines or criteria for identifying sensitive content in the datasets? Best regards,
$1,750 USD in 21 days
0.0

Having delivered several safety alignment datasets for LLM fine-tuning, I recognize the challenges of hardening models against adversarial prompts in Hindi and Indonesian. My experience with RLHF allows me to distinguish between cultural nuances and genuine safety violations, ensuring your dataset provides the edge-case coverage needed for deployment. I have previously managed linguistic tasks identifying "jailbreak" attempts and misinformation that automated filters fail to capture in non-English corpora. By leveraging regional sociopolitical contexts, I ensure data reflects real-world safety risks specific to these regions, providing you with a high-signal dataset that directly addresses the complexities of multi-lingual safety hardening. My approach involves a structured pipeline for data integrity. I map your safety guidelines to linguistic markers, ensuring colloquialisms are accurately categorized within your taxonomy. I utilize a double-blind annotation method and audit to maintain high inter-annotator agreement (IAA), focusing on implicit toxicity. To ensure seamless integration, I deliver datasets in pre-validated JSON formats, adhering to your schema while documenting the rationale behind labeling decisions. This rigor ensures the "hardening" aspect of the dataset is both measurable and reproducible for your training needs, allowing for consistent benchmarking across both the Hindi and Indonesian segments. Are you prioritizing specific domains like PII, or is the focus on general robustness against red-teaming? Do you have a preferred annotation platform, or should I provide a custom solution? I’m available for a quick call to align on the benchmarks you are targeting and the total volume of strings required for this phase of the project. I look forward to helping you secure your model's regional performance.
$2,550 USD in 21 days
0.0

With over 8 years of professional experience, including a Ph.D. in Computer Vision, I am confident in my ability to deliver exceptional annotation for your Hindi and Indonesian safety dataset project. My command of traditional and deep learning algorithms for Computer Vision and Image Processing, aided by expertise in Python, MATLAB, C/C++, PyTorch, TensorFlow, and Keras, enables me to carry out this task efficiently. I have worked extensively with image and text datasets, constructing annotations that meet the high data quality standards required for training robust AI models. I am more than capable of adhering to your deliverable types and formats (including sensitive words, text samples, image samples, "image-text" pairs, and "video-text" pairs), ensuring utmost precision across all categories. My skills extend beyond just generating annotations. I possess an in-depth understanding of optimizing code for efficient processing on various hardware targets, a valuable asset when it comes to handling large-scale datasets like this one. Additionally, my proficiency in Hindi and Indonesian makes me adept at handling multilingual projects professionally. Given the opportunity to contribute to your project, rest assured that you will receive unparalleled expertise and dedication to excellence.
$2,250 USD in 7 days
0.0

Hi there
1. Overview
I hereby propose to deliver a high-quality, manually annotated safety hardening dataset in Hindi and Indonesian, covering sensitive words, text, images, image-text pairs, and video-text pairs. The work will follow strict annotation guidelines, native-language expertise, and multi-layer quality control to ensure accuracy, consistency, and usability for safety modeling.
2. Scope of Work
The project includes annotation of the following data types:
Sensitive words & phrases
Text samples (sentences & paragraphs)
Image samples
Image–text pairs
Video–text pairs
All outputs will be delivered in Excel and JSONL formats, with media files organized in structured folders.
Sincerely, Busisiwe Shandu
$2,250 USD in 7 days
0.0

New York, China
Payment method verified
Member since Oct 14, 2023