I would like to solve the following in an efficient way:

Given a sequence of integers, assign to each integer a sign (+ or -) in such a way that the sum is equal to zero. For all sequences it is guaranteed that they can be added up to 0.

Example:

original sequence: 1, 3, 5, 2, 1, 4

output: +5, -4, -3, +2, -1, +1

Ideas:

Try every combination one after another. For 6 numbers that would look something like this (just the signs):

++++++
+++++-
++++-+
++++--
and so on...

Try sorting the sequence first. Assign a + to the first number, then subtract until you are negative, then add again until you are positive.

first sort: 
5, 4, 3, 2, 1, 1
+5 (sum = 5) 
+5, -4 (sum = 1) 
+5, -4, -3 (sum = -2)
+5, -4, -3, +2 (sum = 0) 
+5, -4, -3, +2, -1 (sum = -1)
+5, -4, -3, +2, -1, +1 (sum = 0)

Is there a better way to solve this? Does the second one make sense, or are there cases where it would not work (under the premise that the sequence can be added up to 0)?

User12547645
  • Efficient solution depends upon how big the summation of all integers could be. Is there any constraint for summation ? – BishalG Nov 26 '18 at 10:27
  • Each integer is between 100k and 1 million – User12547645 Nov 26 '18 at 10:43
  • How many integers are there? – BishalG Nov 26 '18 at 10:47
  • @BishalGautam We do not know that. Could be 50 or 50,000 – User12547645 Nov 26 '18 at 10:49
  • That is called the [partition problem](https://en.wikipedia.org/wiki/Partition_problem), and it is actually NP-complete, although several heuristics exist for it. The one you propose is one of them but, as the Wikipedia page notes, it may fail in some cases such as `{4, 5, 6, 7, 8}`. – jdehesa Nov 26 '18 at 10:57
  • (to be precise, the partition problem is to decide whether a sequence of numbers can be divided in such way, which, in your case, is a given, but finding the right split is still a problem) – jdehesa Nov 26 '18 at 10:59
  • DP(Dynamic Programming) would be the best way to solve this problem. If you already know how dp works try to implement it using dp otherwise first learn about dp. Happy Coding :) – Md Golam Rahman Tushar Nov 26 '18 at 11:22
  • IMO, your greedy proposal is a non-solution, because nothing tells you that in the end you will reach zero. And if not, you need to backtrack and this is where things get harder. –  Nov 26 '18 at 11:29

4 Answers

1

In case of your first idea:
Trying every possible combination one after another and checking the sum will definitely work, but the complexity will be very high. It can be written recursively like this:

#include <vector>
using namespace std;

// Returns true if some choice of signs for sequence[pos..n-1],
// added to the running total `sum`, gives exactly 0.
bool recursion(int pos, int n, int sum, vector<int>& sequence) {
  if (pos == n) {
    return sum == 0;  // every number has a sign: check the total
  }
  bool resultTakingPositive = recursion(pos + 1, n, sum + sequence[pos], sequence);
  bool resultTakingNegative = recursion(pos + 1, n, sum - sequence[pos], sequence);
  return resultTakingPositive || resultTakingNegative;
}

If there are n integers in total, this solution has a time complexity of O(2^n), because at every position there are two options:

  • Add the value to the sum (+ sign).
  • Subtract the value from the sum (- sign).

We have to make this choice for each of the n integers, so multiplying 2 by itself n times gives O(2^n) time complexity.
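
The recursion above only reports whether a zero sum exists. As a rough sketch of how the same exhaustive search could also return the signs, here is a JavaScript version that enumerates all 2^n sign patterns with a bitmask. The function name `bruteForceSigns` and the convention that a set bit means "-" are assumptions made for this illustration, not part of the answer's code.

```javascript
// Brute force: try all 2^n sign patterns.
// Bit i of `mask` set means element i gets "-", otherwise "+".
// Only practical for small n (roughly n <= 20).
function bruteForceSigns(sequence) {
  const n = sequence.length;
  for (let mask = 0; mask < (1 << n); mask++) {
    let sum = 0;
    for (let i = 0; i < n; i++) {
      sum += (mask & (1 << i)) ? -sequence[i] : sequence[i];
    }
    if (sum === 0) {
      // Translate the winning bit pattern into a list of signs.
      return sequence.map((_, i) => ((mask & (1 << i)) ? '-' : '+'));
    }
  }
  return null; // no sign assignment sums to zero
}

console.log(bruteForceSigns([1, 3, 5, 2, 1, 4]));
```

The inner loop makes this O(n * 2^n), so it only illustrates the idea for short sequences.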

In case of your second idea:
You sort the sequence in non-increasing order, assign a + sign to the first number, then subtract until the sum becomes negative, then add again until it becomes positive. Unfortunately, this greedy approach does not always work. For example:
In the sequence: 5, 4, 4, 3, 2
If we try this approach we get: +5 -4 -4 +3 +2, which sums to 2.
But we can make the sum zero with: +5 +4 -4 -3 -2.
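
To make the counterexample easy to reproduce, here is a small JavaScript sketch of the greedy heuristic. The name `greedySigns` is hypothetical, and the tie-breaking rule (add when the running sum is exactly zero) is an assumption, since the question does not say what to do in that case.

```javascript
// Greedy heuristic from the question: sort in non-increasing order,
// add while the running sum is <= 0, subtract while it is positive.
// Returns the sorted values, the chosen signs and the final sum.
function greedySigns(sequence) {
  const sorted = [...sequence].sort((a, b) => b - a);
  let sum = 0;
  const signs = sorted.map((value) => {
    if (sum <= 0) {
      sum += value;
      return '+';
    }
    sum -= value;
    return '-';
  });
  return { sorted, signs, sum };
}

console.log(greedySigns([5, 4, 4, 3, 2]).sum);    // 2: the greedy choice misses zero
console.log(greedySigns([1, 3, 5, 2, 1, 4]).sum); // 0: works for the question's example
```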

Efficient approach:
We can add memoization to the recursive solution above with one simple modification: the running sum starts at baseSum (the total of all numbers) instead of 0, so that it is never negative and can be used as an array index. The memoized states are pos and sum. This is also called dynamic programming. For this to work, n * sum (where sum is the total of all numbers) must be small enough that every state can be cached in memory in a two-dimensional array of size n x (2 * baseSum + 1). Time and space complexity are then both O(n * sum). An example of this approach in C++:

#include<bits/stdc++.h>
using namespace std;

bool recursion(int pos, int n, int sum, vector<int>&sequence,int &baseSum,  vector< vector<int> >&dp) {
  if (pos == n) {
     if (sum == baseSum) return true;
     else return false;
  }
  if (dp[pos][sum] != -1) return dp[pos][sum];
  bool resultTakingPositive =  recursion(pos + 1, n, sum + sequence[pos], sequence, baseSum, dp);
  bool resultTakingNegative =  recursion(pos + 1, n, sum - sequence[pos], sequence, baseSum, dp);
  dp[pos][sum] = (resultTakingPositive || resultTakingNegative);
  return dp[pos][sum];
}

int main() {
  vector<int>sequence;
  int n, baseSum = 0;
  scanf("%d",&n);
  for (int i = 1; i <= n; i++) {
     int x;
     scanf("%d",&x);
     sequence.push_back(x);
     baseSum += x;
  }
  vector< vector<int> >dp(n, vector<int>(2*baseSum + 1, -1));
  cout<<recursion(0, n, baseSum, sequence, baseSum, dp)<<endl;
  return 0;
}

Now, if we want to keep track of the signs used to make the sum 0, we can do this by analysing the recursive calls as follows:

#include<bits/stdc++.h>
using namespace std;

bool recursion(int pos, int n, int sum, vector<int>&sequence,int &baseSum, vector< vector<int> >&dp) {
  if (pos == n) {
     if (sum == baseSum) return true;
     else return false;
  }
  if (dp[pos][sum] != -1) return dp[pos][sum];
  bool resultTakingPositive =  recursion(pos + 1, n, sum + sequence[pos], sequence, baseSum, dp);
  bool resultTakingNegative =  recursion(pos + 1, n, sum - sequence[pos], sequence, baseSum, dp);
  dp[pos][sum] = (resultTakingPositive || resultTakingNegative);
  return dp[pos][sum];
}

void printSolution(int pos, int n, int sum, vector<int>&sequence,int &baseSum, vector< vector<int> >&dp) {
  if (pos == n) {
    cout<<endl;
    return;
  }
  bool resultTakingPositive =  recursion(pos + 1, n, sum + sequence[pos], sequence, baseSum, dp);
  if (resultTakingPositive == true) {
     cout<< "+ ";
     printSolution(pos + 1, n, sum + sequence[pos], sequence, baseSum, dp);
  } else {
    cout<< "- ";
    printSolution(pos + 1, n, sum - sequence[pos], sequence, baseSum, dp);
  }
}

int main() {
  vector<int>sequence;
  int n, baseSum = 0;
  scanf("%d",&n);
  for (int i = 1; i <= n; i++) {
     int x;
     scanf("%d",&x);
     sequence.push_back(x);
     baseSum += x;
  }
  vector< vector<int> >dp(n, vector<int>(2*baseSum + 1, -1));
  if (recursion(0, n, baseSum, sequence, baseSum, dp)) { // if possible to make sum 0 then
      printSolution(0, n, baseSum, sequence, baseSum, dp);
   }
   return 0;
}
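
For comparison, the same dynamic programming idea can be sketched bottom-up. The JavaScript below is not a translation of the C++ code above; it is an illustrative variant that stores, for each prefix of the sequence, only the sums that are actually reachable, together with the sign that produced them. The names `dpSigns` and `layers` are made up for this example.

```javascript
// Bottom-up variant: layers[i] maps every sum reachable with the first i
// numbers to the sign given to number i - 1 when that sum was first reached.
function dpSigns(sequence) {
  let layer = new Map([[0, null]]); // before any number, only sum 0 is reachable
  const layers = [layer];
  for (const value of sequence) {
    const next = new Map();
    for (const sum of layer.keys()) {
      if (!next.has(sum + value)) next.set(sum + value, '+');
      if (!next.has(sum - value)) next.set(sum - value, '-');
    }
    layers.push(next);
    layer = next;
  }
  if (!layer.has(0)) return null; // zero is not reachable with all numbers

  // Walk back from sum 0, undoing one sign per step.
  const signs = new Array(sequence.length);
  let sum = 0;
  for (let i = sequence.length; i > 0; i--) {
    const sign = layers[i].get(sum);
    signs[i - 1] = sign;
    sum -= sign === '+' ? sequence[i - 1] : -sequence[i - 1];
  }
  return signs;
}

console.log(dpSigns([5, 4, 4, 3, 2]));
```

Memory use grows with the number of distinct reachable sums per layer, so this is still bounded by O(n * sum) and may not fit for long sequences of large values.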
BishalG
  • Thank you very much for this answer! The **Efficient approach** really sounds like the solution to my problem. Could you pls provide some pseudo code, so I can understand it completely? – User12547645 Nov 26 '18 at 11:21
  • In your constraint, will there be n * sum able to hold in memory? – BishalG Nov 26 '18 at 11:26
  • Yes. But this solution looks like it would return a bool. So recursive solves IF a linear combination equals to 0 is possible, not what sequence of signs I have to assign to make it possible, right? – User12547645 Nov 26 '18 at 13:04
  • To receive the sequence of signs I would somehow have to keep track of the path taken, which leads to sum == baseSum – User12547645 Nov 26 '18 at 13:12
  • @User12547645, You are right. **Edited** to add the method for printing solution. Thanks !! – BishalG Nov 26 '18 at 13:36
  • Thank you very much @Bishal Gautam. I will check out your solution later and give you a "solved problem" then. Thanks again, looks really like what I was looking for! – User12547645 Nov 26 '18 at 16:29
1

In computer science this is called an optimization problem, and finding a solution to one usually means constructing and exploring a solution tree, possibly while trying to minimize the number of steps needed to reach a solution. But in this case there is no guarantee of finding a solution early: in the worst case the algorithm would explore the whole tree.

Algorithms of this kind, which explore the solution tree depth-first, are called backtracking algorithms. They are usually recursive, and there are other related, well-known families of algorithms that add further optimizations, such as branch & bound.

In general, these ideas could be applied to our backtracking solution; in this case I try the partial solution closest to zero first. By applying branch & bound more thoroughly you could find a better algorithm.

Here is my code demonstrating backtracking in JavaScript, executable in the browser.

/** Returns 
  - false if it can't find a solution
  - An array of signs otherwise 
  */
const findZeroSum = function(problem) {

  const sumElement = (partial, sign, value) => {

    if (sign == '+')
      partial += value;
    else if (sign == '-')
      partial -= value;
    return partial;
  };


  const sortBySubElementAsc = (index) => (a, b) =>
    a[index] - b[index];

  const recursiveFindZeroSum = (partial, index, prob, sol) => {
    const sums = [];
    const signs = '+-';
    const finalElement = ((index + 1) >= prob.length);

    for (let i in signs) {
      let el = signs[i];
      let sum = sumElement(partial, el, prob[index]);

      if (finalElement && sum == 0) {
        // we found a solution!!
        sol[index] = el;
        return sol;
      }
      // store to explore later
      sums.push([el, sum, Math.abs(sum)]);

    }
    if (finalElement) return false;
    // order by the better partial solution
    // (the closest to zero)
    const sortedCandidates = sums.sort(sortBySubElementAsc(2));

    for (let i in sortedCandidates) {
      let el = sortedCandidates[i];
      sol[index] = el[0];
      // go down in the tree
      let partialSol = recursiveFindZeroSum(el[1], index + 1, prob, sol);
      if (partialSol !== false) {
        return partialSol;
      }
    }
    return false;

  };
  const sol = recursiveFindZeroSum(0, 0, problem, []);
  return sol;
  
};
// generate a solution, for testing
const genZeroSum = (start, end) => {
  const res = [];
  let sum = 0;
  for (let i = start; i < end; i++) {
    res.push(i);
    if (Math.random() > 0.5) {
      sum += i;
    } else {
      sum -= i;
    }
  }
  res.push(Math.abs(sum));
  return res;
};

const tests = [
  [1, 1, 2, 5],
  [12, 1, 25, 5],
  [12, 12, 25, 1],
  genZeroSum(1, 20),
  genZeroSum(15, 40),
];

tests.forEach((d,i) => {
  console.log("Test "+i);
  console.log(d.join());
  let sol = findZeroSum(d);
  if (sol){
    sol = sol.join(' ');
  }
  console.log(sol);
});
David Lemon
0

This is the subset sum problem. Compute the sum of all elements of the array and call it S. Then you need a subset whose sum equals S/2: the elements of that subset get a + sign and all others get a - sign. This is a well-known problem and it is solved by dynamic programming. You could read about the Subset Sum algorithm.
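
As a minimal sketch of that reduction, assuming all numbers are non-negative integers, the JavaScript below runs the classic one-dimensional subset-sum table up to S/2 and then reads the chosen subset back out. The function name `subsetSumSigns` and the `from` table are hypothetical names used only for this illustration.

```javascript
// Subset-sum view: find a subset that sums to S/2.
// That subset gets '+', every other element gets '-'.
function subsetSumSigns(sequence) {
  const total = sequence.reduce((a, b) => a + b, 0);
  if (total % 2 !== 0) return null; // an odd total can never be split evenly
  const target = total / 2;

  // from[s] = index of the element that first made sum s reachable,
  // or -1 while s is unreachable.
  const from = new Array(target + 1).fill(-1);
  from[0] = sequence.length; // sentinel: the empty subset reaches 0
  sequence.forEach((value, i) => {
    // Go downwards so each element is used at most once.
    for (let s = target; s >= value; s--) {
      if (from[s] === -1 && from[s - value] !== -1) from[s] = i;
    }
  });
  if (from[target] === -1) return null;

  // Follow the recorded elements back from S/2 down to 0.
  const signs = new Array(sequence.length).fill('-');
  for (let s = target; s > 0; s -= sequence[from[s]]) {
    signs[from[s]] = '+';
  }
  return signs;
}

console.log(subsetSumSigns([1, 3, 5, 2, 1, 4]));
```

The table has S/2 + 1 entries, so this is only practical when S is moderate.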

Multifora
0

There isn't a "most efficient" algorithm for this problem.

From a theoretical standpoint, no algorithm with polynomial worst-case complexity is known for this problem, so brute force (trying all signs) is acceptable. For problems of small size (say fewer than 20 elements), it can be fast enough.

In practice, many heuristics can be tried, and their behavior will depend on the distribution of the input data.