Jun 4 2010, 1:27 pm
|
|
Just came up with a quick answer, it doesn't have any actual file reading though, and out of laziness, I just used Theodis's QuickSort library. It should work, though, in theory, and maybe I'll add the other stuff later.
|
Hey Airjoe, this is quite an awesome problem... I'm sitting in an airport, gonna just jot stuff down...
First Pass:
Magnitude is defined as x+y.
Quadrants ABCD will be defined as (+x,+y) (+x,-y) (-x,+y) (-x,-y).
Terms sX and sY will represent the summed coordinates of Quadrant A; set TA will hold all QuadA data.
Terms bX and bY will represent the summed coordinates of Quadrants B&C of positive magnitude.
Set TBCp will hold all positive Quadrant B&C data; set TBCn will hold all negative Quadrant B&C data.
Set T1 will hold all QuadB data; set T2 will hold all QuadC data.
All data from Quadrant D will be discarded.

If sX+bX>0 and sY+bY>0, then the max set is the union of TA and TBCp.
Else a balancing algorithm between Quadrants B and C must happen.

Balance algorithm 1:
// Since the absolute maximum magnitude has a coordinate of a negative value,
// remove low-magnitude items with offending coordinate values without sinking the already
// good positive coordinate.
// I am suspicious that this does not capture corner cases properly.
Only one of sX+bX and sY+bY failed -- we will call the one that passed the "budget"; the other will be considered the "deficit".
Intersect -budget and TBCp, union with TBCn into set T0.
Combinatorially add to T0 coordinates in T0 that still abide by -budget.
Intersect +deficit with T0, sort by m.

Balance algorithm 2:
// Start over from scratch to evaluate QuadB and QuadC data.
// Start with the highest magnitude of one quadrant, and iterate through the other quadrant
// by magnitude to find miniature sets of positive coordinates.
// Doesn't appear on the surface to be very efficient, but it's more complete than the first algo.
Sort (-x,+y) by magnitude (m) in set T1.
Sort (+x,-y) by magnitude (m) in set T2.
Sort (+x,-y) by x in set T3.
Sort (+x,-y) by y in set T4.
Sort (-x,+y) by x in set T5.
Sort (-x,+y) by y in set T6.
Initialize temporary sets T7 and T8.
Iterate through T1 until m<0. The starting values of x and y for any one iteration will be called X1 and Y1.
Iterate through T2, populating T7 with points that satisfy the criterion x >= X1, y < Y1.
Sort T7 to maximize m.
If T7 is NULL, then the iteration fails and the data point is thrown out.
T8 equals the union of T7 and the iterated point from T1.
Remove the point from T1.

Balance algorithm 3:
// Brute force Quadrant B&C by calculating all combinations and maximizing m... n!... horrible run time.
QED |
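The quadrant bookkeeping in the First Pass above can be sketched in Python. This is one hypothetical reading of the post's rules (the function name `first_pass` and the use of >= rather than the post's strict > are this sketch's assumptions; the contest only requires the sums be non-negative):

```python
def first_pass(points):
    """Partition (x, y) points per the post's First Pass and try the easy accept."""
    TA, TBCp, TBCn = [], [], []
    for x, y in points:
        if x >= 0 and y >= 0:
            TA.append((x, y))        # Quadrant A: keep unconditionally
        elif x < 0 and y < 0:
            pass                     # Quadrant D: discard
        elif x + y >= 0:
            TBCp.append((x, y))      # Quadrants B/C with non-negative magnitude
        else:
            TBCn.append((x, y))      # Quadrants B/C with negative magnitude
    sX = sum(x for x, y in TA)
    sY = sum(y for x, y in TA)
    bX = sum(x for x, y in TBCp)
    bY = sum(y for x, y in TBCp)
    if sX + bX >= 0 and sY + bY >= 0:
        return TA + TBCp             # easy case: no balancing needed
    return None                      # a balance algorithm would have to take over here
```

For example, `first_pass([(2, 1), (3, -1), (-1, 2), (-5, -5)])` discards (-5,-5) and accepts the rest, while `first_pass([(100, -10)])` falls through to the balancing case.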
Jeff:
Would you be willing to submit a DMB that has file reading capabilities? I am reasonably sure your solution won't work perfectly as is, but would like to test. You're on the right track, but you're missing something which I won't hint away just yet. |
Airjoe wrote:
Jeff: Oh, the issue lies in my if(((ST + D.smart) < 0) || ((FT + D.funny) < 0)) line, and the block that follows; it won't accept numbers lower than 0 for F or S on the first try. I'll have to think of a way around that. |
Jeff8500 wrote:
Oh, the issue lies in my if(((ST + D.smart) < 0) || ((FT + D.funny) < 0)) line, and the block that follows; it won't accept numbers lower than 0 for F or S on the first try. I'll have to think of a way around that. This is true, but I believe it's not the only problem. |
Your "corner case" is outside the bounds of the specifications of the problem: you said 1<N<100, so 100 would be too high.
But, in all seriousness, I'm not seeing any reasonable solution to this. It's obviously NP-complete (or NP-hard?), so you don't have a polynomial-time solution (or you have a million dollars...).

You can potentially optimize slightly by immediately accepting every entirely positive entry (2,1) and discarding every entirely negative entry (-8,-5), but that doesn't actually reduce the upper bound, as there could be none of these.

The closest I can think of is taking every positive and net-positive entry, and then proceeding to either remove net-positive or add net-negative entries, according to least net value lost, until the group is valid. However, that runs into a problem with the set:

(100, -10)
(-5, 1) ... (x10)
(-20, 10)

It'll grab the set of ten (-5, 1) entries, losing 50 points, but that's not an optimal solution. If, instead, you use the ratio of the two values, you risk overshooting (taking (-2000,1500) in this case). If, instead, you use the ratio of the two values multiplied by however many you'd need to cover the gap, you run into an issue when you DON'T have enough to cover the gap: same example as before, but this time only nine (-5,1) entries, so you'd take nine and then be forced to take (-20,10), which is obviously suboptimal.

A dynamic programming solution might be possible, but you would be dealing with a table with dimensions 100x200000x200000, which isn't particularly feasible. <edit> Err, that's not right... trying to rethink how the dynamic programming problem would be set up... </edit>

Really, I'm not seeing anything more efficient. The parallels with the knapsack problem are obvious (knapsack easily reduces to this) but it's confounded enough that you can't reduce this to the knapsack problem. |
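The counterexample above is small enough to check exhaustively. Here is a brute-force sketch (exponential, so only viable for tiny inputs; this is not anyone's contest solution, just a checker for the example set): the greedy approach described stops at 50, while the true optimum takes (100,-10) together with (-20,10).

```python
from itertools import combinations

def best_subset_value(dogs):
    """Exhaustively check every subset; exponential, but exact.

    A subset is feasible when both coordinate sums are non-negative;
    we maximize the total magnitude (sum of x plus sum of y).
    """
    best = 0                                    # the empty set is feasible
    for r in range(1, len(dogs) + 1):
        for combo in combinations(dogs, r):
            sx = sum(x for x, y in combo)
            sy = sum(y for x, y in combo)
            if sx >= 0 and sy >= 0:
                best = max(best, sx + sy)
    return best

dogs = [(100, -10)] + [(-5, 1)] * 10 + [(-20, 10)]
print(best_subset_value(dogs))   # prints 80: (100,-10) balanced by (-20,10)
```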
I wrote my solution in Javascript, basically:
First, filter out any dogs with a value (smartness + funniness) of less than 1. They're of no use to you. Second, continue through the list, adding dogs that will not bring either category below 0 for the group. If at any point you encounter one, simply search for another dog (or collection of dogs) to offset its weight. If you can't balance its weight, it's of no use to you and is tossed aside. |
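A rough Python sketch of that filter-then-add heuristic follows. The processing order (descending magnitude) is an assumption the post doesn't specify, the "offsetting collection" rescue step is omitted, and, as later replies in the thread show, this approach is not optimal in general:

```python
def greedy_group(dogs):
    """Sketch of the filter-then-add heuristic described above.

    Step 1 drops dogs whose combined value is below 1; step 2 adds each
    remaining dog only if the running totals stay non-negative.
    NOT optimal in general, per the rest of the thread.
    """
    candidates = [d for d in dogs if d[0] + d[1] >= 1]    # step 1
    sx = sy = 0
    chosen = []
    for x, y in sorted(candidates, key=lambda d: -(d[0] + d[1])):
        if sx + x >= 0 and sy + y >= 0:                   # step 2
            chosen.append((x, y))
            sx, sy = sx + x, sy + y
        # the "search for offsetting dogs" rescue step is omitted here
    return chosen, sx + sy
```

For instance, `greedy_group([(2, 1), (3, -1), (-8, -5)])` drops (-8,-5), keeps the other two, and reports a total of 5.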
Garthor wrote:
But, in all seriousness, I'm not seeing any reasonable solution to this.[...] Really, I'm not seeing anything more efficient. The parallels with the knapsack problem are obvious (knapsack easily reduces to this) but it's confounded enough that you can't reduce this to the knapsack problem. For once, we're in agreement. The best I can come up with is a net-sum sort, but this is confounded by the requirement that the smartness and funiness sums cannot be negative. (As you said, a 'knapsack' problem.) To get around that, you can then attempt to bring the smartness/funiness sums up to the acceptable value by shifting the list. Perhaps three lists (funniest, smartest, best sum) and a compromise approach. Perhaps attempting to swap the most problematic negatives that have the least positive inputs. However, no matter what you do, it runs the risk of not producing as accurate of a result as the "bruteforce" solution unless you know the trick. I'm sure somebody has worked out an absolutely marvelous, foolproof solution. It's an obscure answer you'd dig out a list of already derived answers, and as such this is a bit of a trick question. In practice, given my programming methodology, I would prefer the brute force solution. Up to 1.2676506e30 solutions, you say? (99! is even more than that.) Let the goddamn CPU sit on it, that's its job. To an extent, I disagree with Besse's methods, maybe she should be willing to accept she has a batch of smart-but-unfunny or funny-but-not-smart dogs and play to their strengths. |
Interesting problem. Just thinking about it, my instincts are the same as Garthor's: this seems an awful lot like the knapsack problem. That doesn't necessarily mean it is intractable-- there are plenty of examples of slight variations of NP problems that are perfectly solvable.
I do think it's important for you to specify whether an optimal solution is guaranteed in polynomial time or whether we are just looking at a heuristic here. Unless people want to go about chasing windmills :)

T3h B4tman-- it's more complicated than that, because the ordering matters. You might take a dog with a negative contribution to offset a larger positive one, eg: (1000,-1) + (-999,1)

Finally, if this is the "easiest" problem in a programming competition, I'd be curious at whom this is directed! |
Well, as I said, it reduces to the knapsack problem rather simply. Given a knapsack problem, construct the dog problem like so:
Start with one dog, (0, X), where X is the size of the knapsack. For each object, create a dog (V, -W), where V is the value and W is the weight. The solution to that would be the solution to the knapsack problem.

So, there cannot be a solution better than the best solution to the knapsack problem (unless P=NP, in which case you get a million dollars and destroy the world's economy). So, pseudo-polynomial is the best case, and that would probably use dynamic programming (unless there's another solution to the knapsack problem I'm unaware of?). The only formulation I can think of for a dynamic programming solution would require an absurdly huge table due to the requirements of the problem. |
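The construction can be sketched directly (a tiny illustration, not a formal proof; the brute-force solver is only for checking small instances). One detail worth noting when reading the reduction: the magnitude of a feasible dog set equals the packed value plus the unused capacity, so the two optima coincide exactly when the best packing fills the sack.

```python
from itertools import combinations

def dogs_from_knapsack(items, capacity):
    """Garthor's construction: one (0, X) dog plus one (V, -W) dog per object."""
    return [(0, capacity)] + [(v, -w) for v, w in items]

def best_magnitude(dogs):
    """Exhaustive solver for the dog problem (tiny instances only)."""
    best = 0
    for r in range(1, len(dogs) + 1):
        for combo in combinations(dogs, r):
            sx = sum(x for x, y in combo)
            sy = sum(y for x, y in combo)
            if sx >= 0 and sy >= 0:
                best = max(best, sx + sy)
    return best

# Hypothetical knapsack instance: (value, weight) pairs, capacity 5.
items = [(10, 2), (7, 3), (2, 4)]
dogs = dogs_from_knapsack(items, 5)
# Items (10,2) and (7,3) fill the capacity exactly, so the best magnitude
# equals the best knapsack value here.
print(best_magnitude(dogs))   # prints 17
```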
As usual, Garthor walks in and shows everyone up. The set he provided threw off my algorithm which had been previously working on every set I threw at it (and in fact worked better than the "correct" solution provided by the judges of the contest).
As these questions were part of a contest, I don't think they are in fact unsolvable, but as I said, considering my solution worked better than theirs, it's very possible they didn't fully understand the problem they created.

@Tom: The contest was held at Rensselaer Polytechnic Institute for undergraduate students. There were six problems, and I believe this was the easiest of the six, though I might be wrong (especially considering these recent developments).

I'm going to try to work on this for a little longer. Either way, we'll be seeing some more problems in the future! |
Garthor wrote:
The closest I can think of is taking every positive and net-positive entry, and then proceeding to either remove net-positive or add net-negative entries, according to least net value lost, until the group is valid. However, that runs into the problem with the set: Repeated data can be filtered -- here's a better corner case that I cannot find a way to solve other than by brute force:

(750,-75)
25 points (-6=>-10, 1=>5)
(-20,75)

And as Garthor correctly pointed out, the upper bound cannot be optimized. However, to reiterate Airjoe's point, the real world has more examples than just the upper bound. |
Garthor wrote:
Start with one dog, (0, X), where X is the size of the knapsack. Very clever! |
Tom wrote:
Garthor wrote: Not entirely correct; dog 1 (or set of dogs 1) must be in +/+ territory. Takes a little longer than "knapsack optimal." |
Bootyboy wrote:
Repeated data can be filtered -- here's a better corner case that I cannot find a way to solve other than NP-complete: Bootyboy, do you mean a set as follows?

(750,-75)
(-6,1) (-6,2) (-6,3) (-6,4) (-6,5)
(-7,1) (-7,2) (-7,3) (-7,4) (-7,5)
(-8,1) (-8,2) (-8,3) (-8,4) (-8,5)
(-9,1) (-9,2) (-9,3) (-9,4) (-9,5)
(-10,1) (-10,2) (-10,3) (-10,4) (-10,5)
(-20,75)

? |
The point is that you can take a knapsack problem (in NP) and convert it to a dog problem, so if you can solve the dog problem, you can solve the knapsack problem. = chasing windmills
|
Airjoe wrote:
Bootyboy wrote: Yeah... you can create the effective "inverse" of that dataset by changing (-20,75) to (-200,75). |
Bootyboy wrote:
Yeah... you can create the effective "inverse" of that dataset by changing (-20,75) to (-200,75). Did you manually calculate what an expected answer might be? I fixed my implementation and it works for Garthor's example now, and I'm coming up with 730 for yours. |
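The 730 figure can be checked without enumerating all 2^27 subsets: because many of the 27 points produce the same coordinate sums, tracking the set of reachable (sum x, sum y) pairs collapses the search. A sketch, assuming the 27-point dataset Airjoe spelled out above (still exponential in the worst case, just cheap on this input):

```python
def best_magnitude(dogs):
    """Exact answer via the set of reachable (sum_x, sum_y) pairs.

    Equivalent to brute force, but deduplicates subsets that land on the
    same coordinate sums, which this dataset does constantly.
    """
    reachable = {(0, 0)}
    for x, y in dogs:
        # each point either joins or skips every previously reachable sum
        reachable |= {(sx + x, sy + y) for sx, sy in reachable}
    return max(sx + sy for sx, sy in reachable if sx >= 0 and sy >= 0)

dogs = ([(750, -75)]
        + [(-x, y) for x in range(6, 11) for y in range(1, 6)]
        + [(-20, 75)])
print(best_magnitude(dogs))   # prints 730: (750,-75) together with (-20,75)
```

Taking (750,-75) with all 25 small points only reaches 550, so (750,-75) plus (-20,75) at 730 is indeed the optimum here.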
Tom wrote:
The point is that you can take a knapsack problem (in NP) and convert it to a dog problem, so if you can solve the dog problem, you can solve the knapsack problem. = chasing windmills You never know, Tom- this could be BYOND's big break! ;] Really I'm just having some fun at this point. Leave it up to students to run a contest and you get questions in NP :[ |
No wonder I gave it so much thought, I'm a specialist at chasing windmills. ;)
I think the solution is a division-by-zero solution. Bessie needs to logically understand that what she's trying to do is a fair impossibility (unless she's really patient or has booked some hours on a supercomputer) and she needs to develop rules to compensate (e.g. settle for reduced criteria).

Besides, the girl's in a bit of trouble thinking she can measure funniness and smartness in a numeric range. By the time she's done taking her measurements of all the dogs, I dare say one of them will have had a change of mood. All she really needs is a good approximation of a few top contenders from several measurements over a period of time, which should eliminate the majority of the sample set.

Don't look at what I'm saying here and say, "geeze, that dude's bad at math." No, what I'm actually saying is that being overly beholden to a mathematical methodology is introducing a problem. This is why we need both kinds of programming: the kind where you can write efficient code for computers (wholly mathematical machines) and the kind where you can understand how to logically bridge the analog gaps. |