
Is it possible to index documents offline in a native Android app using Lucene?

We built this for the web, but we are looking for something that works offline in native Android apps.

Sample data:

[ { "name":"abc", "desc":"ndex documents offline" }, { "name":"jjj", "desc":"index my data" } ]

I have to index this data and then search it.

Analyzer code:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// In-memory index. To store the index on disk, use this instead:
// Directory directory = FSDirectory.open(new File("/tmp/testindex"));
Directory directory = new RAMDirectory();

// StandardAnalyzer lives in the lucene-analyzers-common jar.
// An anonymous Analyzer whose createComponents() returns null
// fails as soon as a document is added.
Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_4_10_4, analyzer);

// Create a new index or append to an existing one.
// Use OpenMode.CREATE to always overwrite the directory.
config.setOpenMode(OpenMode.CREATE_OR_APPEND);
IndexWriter indexWriter = new IndexWriter(directory, config);
vikrant

2 Answers

2

Use this tutorial as a base: http://www.avajava.com/tutorials/lessons/how-do-i-use-lucene-to-index-and-search-text-files.html

Then use this method to add JSON objects to the index:

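import java.io.IOException;
import java.util.List;
import java.util.Set;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.IndexWriter;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;

// The casts below assume json-simple (org.json.simple), where JSONArray
// is a List and JSONObject is a Map.
@SuppressWarnings("unchecked")
public void addDocuments(IndexWriter indexWriter, JSONArray jsonObjects) {
    // One shared field type: indexed, tokenized and stored,
    // with positions, offsets and term vectors.
    final FieldType bodyOptions = new FieldType();
    bodyOptions.setIndexed(true);
    bodyOptions.setIndexOptions(FieldInfo.IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
    bodyOptions.setStored(true);
    bodyOptions.setStoreTermVectors(true);
    bodyOptions.setTokenized(true);

    for (JSONObject object : (List<JSONObject>) jsonObjects) {
        Document doc = new Document();
        // Every JSON key becomes a Lucene field of the same name.
        for (String field : (Set<String>) object.keySet()) {
            // String.valueOf keeps numbers and booleans from failing a String cast.
            doc.add(new Field(field, String.valueOf(object.get(field)), bodyOptions));
        }
        try {
            System.out.println(doc);
            indexWriter.addDocument(doc);
        } catch (IOException ex) {
            System.err.println("Error adding documents to the index. " + ex.getMessage());
        }
    }
}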

You might need to add the following dependency:

<!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-analyzers-common -->
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-common</artifactId>
    <version>4.10.4</version>
</dependency>
Japu_D_Cret
  • I am using Lucene 4.10.4 and getting an error on StandardAnalyzer(): "Error:(59, 33) error: cannot find symbol class StandardAnalyzer" – vikrant Mar 27 '17 at 14:16
  • @vikrant Can you show us your code so far, including the imports? Also, here is the documentation for your version of StandardAnalyzer: https://lucene.apache.org/core/4_10_4/analyzers-common/org/apache/lucene/analysis/standard/StandardAnalyzer.html – Japu_D_Cret Mar 27 '17 at 14:19
  • @vikrant - Add the analyzers-common jar to your classpath. – femtoRgon Mar 27 '17 at 14:21
  • @femtoRgon Thanks for mentioning that. I had forgotten it and have now added the missing dependency. – Japu_D_Cret Mar 27 '17 at 14:22
  • Getting an error on indexWriter: java.lang.NoSuchMethodError: No virtual method toPath()Ljava/nio/file/Path; in class Ljava/io/File; or its super classes (declaration of 'java.io.File' appears in /system/framework/core-libart.jar) – vikrant Mar 27 '17 at 14:59
  • I think this is an Android-related issue. I am trying to read a JSON file and index it, and I get the error on this line: "indexWriter = new IndexWriter(directory, indexwriterconfig)". – vikrant Mar 27 '17 at 15:11
  • @vikrant Yes, it is related, but maybe you can suppress the dependency that causes the issue. For reference, see http://stackoverflow.com/questions/37415045/gcm-error-googlecloudmessaging-register. Maybe try forcing a newer version of that dependency. – Japu_D_Cret Mar 27 '17 at 15:40
  • Is there any other way to index documents offline in Android, using NLP or machine learning? – vikrant Mar 28 '17 at 14:14
  • @vikrant - Lucene 4.10.4 does not use `Path`. Sounds like you have added the wrong version of a jar. Make sure all your Lucene jars are 4.10.4. – femtoRgon Mar 29 '17 at 15:00
  • @femtoRgon It might not, but apparently one of its dependencies does, and without that it's useless. – Japu_D_Cret Mar 29 '17 at 15:01
  • @Japu_D_Cret - Lucene 4.10.4 is made in Java 6. The `Path` class did not exist in Java 6. Also, the error listed says it's coming from `IndexWriter`, a Lucene class, not one of its dependencies. Sounds like they have the wrong version of the Lucene core jar in the classpath. They might even have two versions of it in the classpath, or something like that. – femtoRgon Mar 29 '17 at 15:05
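
The NoSuchMethodError quoted in the comments says that the device's java.io.File has no toPath() method; java.nio.file is only part of Android from API level 26 (Android 8.0). The following is a minimal diagnostic sketch, not part of any answer here, that uses plain reflection to check whether a method exists at runtime and to report where a class was loaded from, which can help spot a wrong or duplicated jar. The class name RuntimeCheck and its helpers are hypothetical, and on Android the code source may simply be reported as unknown.

```java
import java.io.File;
import java.security.CodeSource;

class RuntimeCheck {

    /** Returns true if the type has a public no-argument method with this name. */
    static boolean hasMethod(Class<?> type, String name) {
        try {
            type.getMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** Describes where the named class was loaded from, without initializing it. */
    static String codeSourceOf(String className) {
        try {
            Class<?> type = Class.forName(className, false, RuntimeCheck.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "unknown (platform class or no code source)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        System.out.println("File.toPath() available: " + hasMethod(File.class, "toPath"));
        System.out.println("IndexWriter loaded from: "
                + codeSourceOf("org.apache.lucene.index.IndexWriter"));
    }
}
```

If the first line prints false, any library code path that calls File.toPath() will fail on that device regardless of how the classpath is arranged.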
0

You can use this modification of Lucene 7.3.0 for searching in Android 8.0 or above: https://github.com/texophen/lucene-android

See this answer ( https://stackoverflow.com/a/76719477/22251021 ) for how to use it.

In order to use Lucene in Android, you can:

  • Extend org.apache.lucene.store.BaseDirectory:
package com.texopher.gophoxes;

import java.io.IOException;
import java.util.Collection;
import java.util.List;
import java.util.UUID;

import org.apache.lucene.store.BaseDirectory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.LockFactory;

public class BucketDirectory extends BaseDirectory {

    protected Bucket bucket;
    
    public BucketDirectory(Bucket bucket, LockFactory lockFactory) {
        super(lockFactory);
        this.bucket = bucket;
    }

    @Override
    public void close() throws IOException {
    }

    @Override
    public IndexOutput createOutput(String name, IOContext context) throws IOException {
        return new BucketIndexOutput(bucket, name);
    }

    @Override
    public IndexOutput createTempOutput(String prefix, String suffix, IOContext context) throws IOException {
        return new BucketIndexOutput(bucket, prefix + UUID.randomUUID().toString().replaceAll("-", "") + suffix);
    }

    @Override
    public void deleteFile(String name) throws IOException {
        try {
            bucket.deleteFile(name);
        } catch (Exception e) {
            // Report the failure to Lucene instead of swallowing it.
            throw new IOException("Could not delete " + name, e);
        }
    }

    @Override
    public long fileLength(String name) throws IOException {
        try {
            return bucket.fileLength(name);
        } catch (Exception e) {
            // Returning 0 here would make a failed lookup look like an empty file.
            throw new IOException("Could not read the length of " + name, e);
        }
    }

    @Override
    public String[] listAll() throws IOException {
        try {
            List<String> names = bucket.listFiles();
            String[] result = names.toArray(new String[0]);
            // The Directory.listAll() contract asks for the names in sorted order.
            java.util.Arrays.sort(result);
            return result;
        } catch (Exception e) {
            throw new IOException("Could not list the files in the bucket", e);
        }
    }

    @Override
    public IndexInput openInput(String name, IOContext context) throws IOException {
        try {
            return new BucketIndexInput(bucket, name);
        } catch (Exception e) {
            // Returning null would only cause a NullPointerException later on.
            throw new IOException("Could not open " + name, e);
        }
    }

    @Override
    public void rename(String source, String dest) throws IOException {
        try {
            bucket.rename(source, dest);
        } catch (Exception e) {
            throw new IOException("Could not rename " + source + " to " + dest, e);
        }
    }

    @Override
    public void sync(Collection<String> arg0) throws IOException {
    }

    @Override
    public void syncMetaData() throws IOException {
    }

}
  • Use the following code for searching:
Query qr = new TermQuery(new Term("code", this.md5(link)));
Bucket bk = new Bucket(this.gopherServer, this.hole, this.magic);
Directory indexDirectory = new BucketDirectory(bk, new SingleInstanceLockFactory());
IndexReader reader = DirectoryReader.open(indexDirectory);
IndexSearcher searcher = new IndexSearcher(reader);
TopDocs docs = searcher.search(qr, 1);
ScoreDoc[] hits = docs.scoreDocs;
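
The search snippet calls this.md5(link), which the answer does not show. A minimal sketch of what such a helper might look like, assuming it returns the lowercase hexadecimal MD5 digest of the UTF-8 bytes of the string; that encoding and format are assumptions, and the class name Md5Hex is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Hex {

    /** Returns the MD5 digest of the text as 32 lowercase hex characters. */
    static String md5(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to 0..255 so negative bytes format as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5("abc"));
    }
}
```

A TermQuery only matches exact indexed terms, so the "code" field would have to be indexed with the same hash function and without tokenization for this lookup to find anything.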
texopher
  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. – Chenmunka Jul 23 '23 at 12:45